ALTERNATIVE MODELS FOR THE ANALYSIS OF VARIANCE
By Henry Scheffé
University of California, Berkeley


Summary. The terminology is defined and illustrated in Section 1. A little 
historical background not very familiar to statisticians is sketched in Section 2. 
In Section 3 some difficulties about the formulation of random interactions are 
discussed. Section 4 deals with models reflecting a randomization in the experi- 
ment to assign the treatment combinations to finite populations of experimental 
units. 


1. Introduction. A broad survey of the present state of the theory of alternative 
models in the analysis of variance would require a monograph and be soon out- 
moded. The selective approach of this paper has been determined mainly by the 
interests of the writer. His chief interests are in the mathematical models—their 
formulation, motivation, and statistical inference in them. The reader is referred 
to a useful survey by Crump (1951); he will find little overlap between the two
papers. In these papers there is little attempt to deal with the always important 
and often difficult problems of careful tailoring of the models to particular situa- 
tions in the physical world; discussions of such problems may be found in the 
work of Kempthorne and Wilk cited in the References at the end. 

The analysis of variance might be defined as a statistical method for analyzing 
observations assumed to be constituted of linear combinations, subject to a 
certain restriction to be stated below, of effects. (We use the terminology of
“effects” to include what are usually called the “general mean,” “main effects,”
“interactions,” and “errors.”) The effects, not directly observable quantities,
are more or less idealized formulations of some properties of interest to the in- 
vestigator in the phenomena underlying the observations. The purpose of the 
analysis is to make inferences about some of the effects, these inferences to be 
valid regardless of the magnitudes of certain other effects, which may be present 
in the linear combinations, and which we may be more desirous of “eliminating” 
than “assessing.” 

The theory of this method naturally has implications about how observations 
should be taken or an experiment planned, i.e., about experimental design. The 
term “experimental design” is used here in a broad sense to include, for example,
the application described below of variance-components analysis to the
non-experimental science of astronomy to help decide on how many nights
observations should be taken and how many per night.

The restriction mentioned above is that the coefficients in the linear combina- 
tions which give the observations in terms of the effects be integers; usually they 
are exclusively 0 and 1. (For an example where some are —1 see Scheffé (1952, 
Sec. 7); for an example where some are 2, see Kempthorne (1952, Sec. 6.8). If 
more than a few different integers occur as coefficients of the same effect it would 
be customary to say we have a case of analysis of covariance or of regression 
analysis instead of analysis of variance; it does not seem worthwhile to attempt 
to draw sharp dividing lines.) 

Each effect is regarded as either an unknown constant or else as a random 
variable, the joint distribution of the random variables being in general not as- 
sumed completely known. If the effect is treated as an unknown constant it is 
called a fixed effect or Model I effect, otherwise it is called a random effect or
Model II effect. (Some writers apply the terminology of “Model II effects” only
to random effects which satisfy certain further distribution assumptions including 
independence and normality.) The diversity of models that have been consid- 
ered for the analysis of variance arises from the possibilities of treating various 
effects as fixed or random and of making various distribution assumptions about 
the random effects. For simplicity we shall always assume that all random effects 
have finite variance (this might not be necessary in a nonparametric approach). 
We may assume all random effects to have zero means by introducing further 
fixed effects if necessary. We shall assume that there is at least one set of random 
effects equal in number to the number of observations, a different one of which
appears in each observation, which is called a set of errors. There is usually one 
fixed effect that appears in every observation; if this is present we shall call it 
the additive constant—or the general mean if it is the mean of all the observations 
in some sense. 

The equations expressing the observations as linear combinations of the ef- 
fects will be called the model equations: Together with the distribution assump- 
tions on the random effects and possible side conditions on the fixed effects they 
determine the model. The model will be called a fixed-effects model or Model I
if the only random effects in the model equations are the error terms; it is called 
a random-effects model or a variance-components model or Model II if all effects, 
except the additive constant if there is one, are random effects. A case falling 
under neither of these categories is called a mixed model. 

We now illustrate the terminology. Imagine an experiment in a factory to 
study the performance of P machines and Q workers. Suppose that all the 
machines, each run by a single worker, produce small parts of the same kind, 
that a large number is produced daily on each machine, and that there is con- 
siderable day-to-day variation for any worker (for some purposes we will treat 
the output as though it were a continuous random variable). Denote by $\mu_{pq}$ the
“true” daily output of the qth worker on the pth machine; this differs from the
observed output by an “error.” We might regard $\mu_{pq}$ as an idealized long-term
daily average for the qth worker on the pth machine after he has reached a
relatively stable period following a learning stage. Our convention on
subscripts throughout will be that subscripts p, p', p'' range from 1 to P; q, q', q''
from 1 to Q, etc.; and that when a subscript is replaced by a dot it indicates
that the arithmetic mean has been taken over that subscript; thus
$\mu_{p.} = \sum_q \mu_{pq}/Q$, $\mu_{..} = \sum_p \sum_q \mu_{pq}/(PQ)$. In the familiar way we define the
general mean to be

(1.1)    $\mu = \mu_{..}$,

the main effect for the pth machine to be

(1.2)    $\alpha_p = \mu_{p.} - \mu_{..}$,

the main effect for the qth worker to be

(1.3)    $\beta_q = \mu_{.q} - \mu_{..}$,

and the interaction between the pth machine and the qth worker to be

(1.4)    $\gamma_{pq} = \mu_{pq} - \mu_{p.} - \mu_{.q} + \mu_{..}$,

so that

(1.5)    $\mu_{pq} = \mu + \alpha_p + \beta_q + \gamma_{pq}$,

where

(1.6)    $\sum_p \alpha_p = 0$,  $\sum_q \beta_q = 0$,  $\sum_p \gamma_{pq} = 0$ for all q,  $\sum_q \gamma_{pq} = 0$ for all p.
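The resolution (1.1) to (1.6) can be checked numerically. The following is a minimal sketch, not part of the original paper: the function name `decompose` and the table of true outputs are hypothetical, and exact fractions are used only so that the side conditions (1.6) hold without rounding error.

```python
from fractions import Fraction

def decompose(mu):
    """Split a P x Q table of true means into the effects of (1.1)-(1.4)."""
    P, Q = len(mu), len(mu[0])
    cells = [[Fraction(v) for v in row] for row in mu]
    row_mean = [sum(row) / Q for row in cells]                              # mu_{p.}
    col_mean = [sum(cells[p][q] for p in range(P)) / P for q in range(Q)]   # mu_{.q}
    general = sum(row_mean) / P                                             # (1.1)
    alpha = [m - general for m in row_mean]                                 # (1.2)
    beta = [m - general for m in col_mean]                                  # (1.3)
    gamma = [[cells[p][q] - row_mean[p] - col_mean[q] + general             # (1.4)
              for q in range(Q)] for p in range(P)]
    return general, alpha, beta, gamma

# Hypothetical true daily outputs for P = 2 machines (rows), Q = 3 workers (columns).
mu = [[10, 12, 14],
      [11, 15, 16]]
general, alpha, beta, gamma = decompose(mu)

# Side conditions (1.6): every set of effects sums to zero over each subscript.
side_conditions_hold = (
    sum(alpha) == 0
    and sum(beta) == 0
    and all(sum(gamma[p][q] for p in range(2)) == 0 for q in range(3))
    and all(sum(gamma[p]) == 0 for p in range(2))
)
```

Adding the pieces back together as in (1.5) recovers each entry of the table, which is the sense in which the effects are a re-expression of the true means rather than additional assumptions.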


Suppose an experiment is contemplated in which each of the workers is put
for K days on each of the machines. For reasons to become clear later, let us now
change from the subscripts p, q to i, j. Then the output of the jth worker the
kth day he is on the ith machine may be written

(1.7)    $y_{ijk} = \mu_{ij} + e_{ijk}$,

where $e_{ijk}$ is the “error.” We shall assume the set of IJK errors $\{e_{ijk}\}$ to be
independently distributed with zero means and common variance $\sigma_e^2$. For some
purposes a normality assumption on the errors may be added. We shall sometimes
employ the jargon that the I machines are the “I levels of factor A” and
the J workers are the “J levels of factor B” in the experiment.

The formulation of the interactions in the other models causes some difficulties 
we wish to postpone; so to keep all the illustrations simple at this point, let us 
assume the interactions between machines and workers are zero. With all 
$\gamma_{ij} = 0$ the model equation then becomes

(1.8)    $y_{ijk} = \mu + \alpha_i + \beta_j + e_{ijk}$,

and since

(1.9)    $\sum_i \alpha_i = 0$,

(1.10)   $\sum_j \beta_j = 0$,

and $E(e_{ijk}) = 0$, therefore $\mu$ is the general mean in an obvious sense. The effects
are the terms of four kinds on the right of (1.8); all are fixed effects except the 
errors, so this is a fixed-effects model. 

Suppose the J workers in the experiment could be regarded as a random 
sample from a large pool of workers and that we were more interested in making 
inferences about this population of workers than about the J individuals in 
the experiment. Idealizing the population of workers as infinite and assuming 
no interactions we would then be led to the model equation 


(1.11)   $y_{ijk} = \mu + \alpha_i + b_j + e_{ijk}$,


where the random variables $\{b_j\}$, the worker effects, are independently and
identically distributed. We again assume the errors $\{e_{ijk}\}$ to be independently
distributed with zero means and equal variance, and we also assume them
independent of the $\{b_j\}$. We may without loss of generality assume the $\{\alpha_i\}$
continue to satisfy (1.9), but the $\{b_j\}$ for the J workers in the experiment of
course no longer satisfy the analogue of (1.10). By adding $E(b_j)$ to $\mu$ we may
redefine the $\{b_j\}$ and $\mu$ so that $E(b_j) = 0$ and $\mu$ is again the general mean. This
example is one of a mixed model.

It would be appropriate to treat the machine effects also as random if the 
machines in the experiment were of the same make and model and could be 
regarded as a random sample from some population of machines which is of 
primary interest. If this population be also idealized as infinite, the machine 
effects are then independently and identically distributed random variables 
$\{a_i\}$, and again we may without loss of generality assume $E(a_i) = 0$. Since the
random sample of workers is assumed to be selected independently of the
random sample of machines, the set of worker effects $\{b_j\}$ is independent of the
set $\{a_i\}$. We further assume the set of errors $\{e_{ijk}\}$ to be independent of the sets
of effects $\{a_i\}$ and $\{b_j\}$, and again assume the $\{e_{ijk}\}$ to be independent with
zero means and equal variance. We now have the model equation


(1.12)   $y_{ijk} = \mu + a_i + b_j + e_{ijk}$


for a random-effects model or variance-components model. The latter terminology
arises from the relation that the variance of an observation is now


(1.13)   $\sigma_y^2 = \sigma_A^2 + \sigma_B^2 + \sigma_e^2$,


where $\sigma_y^2$, $\sigma_A^2$, $\sigma_B^2$, and $\sigma_e^2$ are the respective variances of the observations, the
machine (factor A) effects, the worker (factor B) effects, and the errors, and
the three terms on the right of (1.13) are appropriately called the variance
components.
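The independence assumptions behind (1.12) fix not only the variance (1.13) but also the covariance of any two observations: a component contributes only when the two observations share the corresponding random effect. The sketch below spells this out; the function name and the numerical variance components are hypothetical, and the covariance rule is a direct consequence of the stated assumptions rather than a formula quoted from the paper.

```python
def cov_observations(idx1, idx2, var_a, var_b, var_e):
    """Covariance of y_{ijk} and y_{i'j'k'} under the random-effects model (1.12).

    idx1 and idx2 are (machine, worker, day) index triples. The a's, b's and
    e's are assumed mutually independent, as in the text.
    """
    (i1, j1, k1), (i2, j2, k2) = idx1, idx2
    cov = 0
    if i1 == i2:
        cov += var_a      # both observations share the machine effect a_i
    if j1 == j2:
        cov += var_b      # both observations share the worker effect b_j
    if (i1, j1, k1) == (i2, j2, k2):
        cov += var_e      # the error is shared only by an observation with itself
    return cov

# Hypothetical variance components.
VAR_A, VAR_B, VAR_E = 4, 2, 1
variance_of_observation = cov_observations((1, 1, 1), (1, 1, 1), VAR_A, VAR_B, VAR_E)
same_machine_only = cov_observations((1, 1, 1), (1, 2, 1), VAR_A, VAR_B, VAR_E)
same_cell_other_day = cov_observations((1, 1, 1), (1, 1, 2), VAR_A, VAR_B, VAR_E)
unrelated = cov_observations((1, 1, 1), (2, 2, 1), VAR_A, VAR_B, VAR_E)
```

The first value reproduces (1.13); the others show that observations in the same row or column of the layout are correlated under this model even though the errors are independent.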

We see that in formulating a model one must ask for each factor whether one 
is interested individually in the particular levels occurring in the experiment or 
primarily in a population from which the levels in the experiment can be regarded 
as a sample: the main effects are accordingly treated as fixed or as random. (It 
is conceivable that for two different purposes the same data might be analyzed 
according to two different models in which the same main effects are regarded
as fixed or as random effects.) Interactions between several factors are naturally
treated as fixed if all these factors have fixed effects, and as random if one or
more of these factors have random effects. The difficulty already alluded to 
and to be discussed in Section 3 concerns the kind of dependence assumptions 
to be made about the random interactions. While the decision as to whether the 
main effects of any factor, say A, are to be treated as fixed or random obviously 
affects the meaning of the main effects of A and the interactions of sets of factors 
including A, it also affects the meaning of the main effects of the other factors 
and of the interactions of sets of factors not including A: This is because the 
latter main effects and interactions are defined as averages over the levels of A, 
and the decision determines whether these averages are taken over the particular 
levels of A in the experiment, or over a population of levels, of which the levels 
in the experiment are a sample. 

The assumption of independent errors made in this section is not appropriate 
if the errors arise from the random assignment of experimental units to treat- 
ment combinations from finite populations of units; this will be considered in 
Section 4. 

We shall abide by the notational rule followed in this section of denoting fixed 
effects by Greek letters and random effects by Latin letters. 


2. Some history. Fixed-effects models in which the covariance matrix of the 
errors is known up to a scalar factor are special (because of the restriction on the 
coefficients to be integers) cases of the models, sometimes called “linear
hypothesis” models, used in the theory of least squares. It is well known that the
theory of least squares was invented independently and published by Legendre 
(1806) and Gauss (1809; see also Plackett (1949)) in books on astronomical prob- 
lems, so that to these problems we must ascribe the origin of the fixed-effects 
models. It is not so well known that astronomers, long before statisticians, also 
formulated variance-components models. For the references establishing this I 
am indebted to Dr. Churchill Eisenhart. 

Very explicit use of a variance-components model for the one-way layout is 
made by Airy (1861, Part IV), with all the subscript notation⁴ necessary for
clarity. Suppose that on I nights observations are made with a telescope on the
same phenomenon, $J_i$ observations on the ith night. Airy assumes the following
structure for the jth observation on the ith night:


(2.1)    $y_{ij} = \mu + c_i + e_{ij}$,


where $\mu$ is the general mean or “true” value, and the $\{c_i\}$ and $\{e_{ij}\}$ are random
effects with the following meanings: He calls $c_i$ the “constant error,” meaning
it is constant on the ith night; we would call it the ith night effect; it is caused
by the “atmospheric and personal circumstances” peculiar to the ith night.
The $\{e_{ij}\}$ for fixed i we would call the errors about the (conditional) mean
$\mu + c_i$ on the ith night. It is implied by Airy’s discussion that he assumes all
the $e_{ij}$ independently and identically distributed, similarly for the $c_i$, that the
$\{e_{ij}\}$ are independent of the $\{c_i\}$, and that all have zero means. Let us denote
the variances of the $\{e_{ij}\}$ and the $\{c_i\}$ by $\sigma_e^2$ and $\sigma_c^2$.


4 To conform to the notation of this paper I have only changed his capital letters (1861,
Sec. 118; Sec. 133 in 3rd ed.) to lower case, and I have added the general mean $\mu$ since he
writes the equations for the observations minus $\mu$ instead of for the observations.

To decide about his equivalent of the hypothesis $\sigma_c^2 = 0$, Airy compares, as
we would, a between-nights measure of variability with a within-nights measure,
but he uses different measures than we would. For brevity suppose all $J_i = J$.
From the observations on the ith night Airy estimates $\sigma_e$ by the r.m.s.
(root-mean-square) estimate (note his use of J − 1 in the denominator)


(2.2)    $\hat\sigma_{e,i} = \{\sum_j (y_{ij} - y_{i.})^2/(J - 1)\}^{1/2}$,


and he then takes the arithmetic average of these to get $\hat\sigma_e = \sum_i \hat\sigma_{e,i}/I$, where
we would use the r.m.s. average. Actually he works with the “probable errors,”
which are a conventional constant, calculated from the normal distribution,
times the r.m.s. estimates. For the between-nights measure he uses not a function
of the between-nights sum of squares $J \sum_i (y_{i.} - y_{..})^2$ but the corresponding
mean deviation from the mean,


(2.3)    $d = I^{-1} \sum_i |y_{i.} - y_{..}|$.


Under the hypothesis $\sigma_c^2 = 0$ he calculates an approximate probable error for
d by replacing $y_{..}$ in (2.3) by $\mu$ (so the terms in (2.3) become independent) and
the unknown $\sigma_e$ by $\hat\sigma_e$. If d is less than the approximate probable error thus
obtained he accepts $\sigma_c^2 = 0$; if d is large compared with the approximate probable
error (how large Airy does not specify, and he seems to despair of the possibility
of a mathematical criterion), he rejects $\sigma_c^2 = 0$ and estimates $\sigma_c$ by a conventional
constant times d. There is no attempt to correct this estimate of $\sigma_c$ for bias due
to $\sigma_e$ inherent in the relation $\mathrm{Var}(y_{i.}) = \sigma_c^2 + J^{-1}\sigma_e^2$; anyway, under his
procedure Airy’s estimate of $J^{-1}\sigma_e^2$ would be small compared with his estimate of
$\sigma_c^2$. One wonders whether Airy used the mean deviation to measure the
between-nights variation rather than the r.m.s. measure, as he did in (2.2), because he
found it easier to approximate the probable error of the former.

Chauvenet (1863, Vol. 2, Art. 163, 164), while not writing model equations 
like (2.1) with all the subscripts, nevertheless implies such models and utilizes 
the consequences, such as $\mathrm{Var}(y_{..}) = I^{-1}(\sigma_c^2 + J^{-1}\sigma_e^2)$ from (2.1). He concludes
from this that there is no practical advantage in increasing J beyond a certain 
point in such a case, and credits this idea to Bessel (1820, p. 166), saying that 
Bessel thought J = 5 sufficient for a certain situation. Chauvenet’s reference 
to Bessel on this specific point (J = 5) is incorrect, but the page he cites does 
contain a formula for the probable error of a sum of independent random 
variables which could be the basis for such a conclusion. Probably Bessel made 
the remark elsewhere. 
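The point about diminishing returns in J can be made concrete. Under model (2.1) with I nights and J observations per night, the variance of the overall mean is $(\sigma_c^2 + \sigma_e^2/J)/I$, which cannot fall below $\sigma_c^2/I$ however large J becomes. The sketch below tabulates this; the function name and the variance components are hypothetical illustrations, not values from Airy, Chauvenet, or Bessel.

```python
def var_grand_mean(nights, per_night, var_c, var_e):
    """Variance of the overall mean under the one-way model (2.1).

    nights plays the role of I, per_night the role of J; var_c and var_e are
    the night and error variance components.
    """
    return (var_c + var_e / per_night) / nights

# Hypothetical components: night-to-night variance 1, within-night variance 4.
VAR_C, VAR_E, NIGHTS = 1.0, 4.0, 10
table = {J: var_grand_mean(NIGHTS, J, VAR_C, VAR_E) for J in (1, 5, 25, 100)}
floor = VAR_C / NIGHTS   # limit as the number of observations per night grows
```

With these made-up numbers, going from J = 1 to J = 5 cuts the variance from 0.5 to 0.18, while going from J = 25 to J = 100 only moves it from 0.116 to 0.104, against a floor of 0.1. Adding nights, by contrast, reduces both terms.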

Fisher’s (1918) basic paper on population genetics, which introduces the 
terms “variance” and “analysis of variance,” employs variance-components
models, and they are of course behind his (1925, Sec. 40) treatment of the
intraclass correlation coefficient. First to add to Fisher’s analysis of variance
tables the very useful column of expected mean squares for variance-components
models appears to have been Tippett (1931, Table XXIV; in (1929) he calculated
but did not table them). While a mixed model is implied by Fisher’s (1935,
Sec. 65) discussion of varietal trials in a randomly selected set of locations, and
by Yates’ (1935a) analysis of the split-plot design, the first explicit model
equation the writer has found for this case is in a paper on mental tests by
Jackson (1939), where the score of the jth individual on the ith trial of a test
is assumed to have the structure (1.11), with the subscript k suppressed, the
trial effects being treated as fixed and the “individual” effects as random.
Interaction effects which are clearly labeled as random effects were introduced
into the variance-components model equation (1.12) by Crump (1946). The
terminology of “Model I” and “Model II” is due to Eisenhart (1947). Basic
work of Tukey will be discussed in Section 3, and of Fisher and Neyman in 
Section 4. In textbooks, alternative models for the analysis of variance were 
introduced by Mood (1950) and developed at length by Kempthorne (1952), 
and Anderson and Bancroft (1952). 


3. Treatment of random interactions. The examples of fixed-effects models, 
mixed models, and random-effects models given at the end of Section 1 all refer 
to an I × J two-way layout with K (K ≥ 1) observations per cell, where $y_{ijk}$
denotes the kth observation in the i, j-cell. Let us define the usual sums of
squares, namely, those for rows (or A),


(3.1)    $SS_A = JK \sum_i (y_{i..} - y_{...})^2$,


for columns (or B),

(3.2)    $SS_B = IK \sum_j (y_{.j.} - y_{...})^2$,

for interactions (or A × B),


(3.3)    $SS_{AB} = K \sum_i \sum_j (y_{ij.} - y_{i..} - y_{.j.} + y_{...})^2$,


for error,


(3.4)    $SS_e = \sum_i \sum_j \sum_k (y_{ijk} - y_{ij.})^2$,


and a “pooled” error sum of squares,


(3.5)    $SS_{pe} = SS_{AB} + SS_e$.


As long as interactions are omitted from the model equations, no great dif- 
ferences appear among the three models formulated in connection with (1.8), 
(1.11), and (1.12). The expected values of the above sums of squares in the 
three models may actually be expressed by the same formulas by the usual 
device of suitably defining the symbols $\sigma_A^2$ and $\sigma_B^2$: When the levels of factor A
have random main effects $\{a_i\}$, as in (1.12), we define

(3.6)    $\sigma_A^2 = \mathrm{Var}(a_i)$;

likewise

(3.7)    $\sigma_B^2 = \mathrm{Var}(b_j)$

in (1.11) and (1.12). When the levels of factor A have fixed main effects $\{\alpha_i\}$,
as in (1.8) and (1.11), we define $\sigma_A^2$ to be the following function of the fixed
effects:⁵


(3.8)    $\sigma_A^2 = (I - 1)^{-1} \sum_i \alpha_i^2$;

likewise

(3.9)    $\sigma_B^2 = (J - 1)^{-1} \sum_j \beta_j^2$

in (1.8). Then for all three models

(3.10)   $E(MS_A) = JK\sigma_A^2 + \sigma_e^2$,

(3.11)   $E(MS_B) = IK\sigma_B^2 + \sigma_e^2$,

(3.12)   $E(MS_{AB}) = E(MS_e) = E(MS_{pe}) = \sigma_e^2$,


where the mean square $MS_x$ is defined as the corresponding $SS_x$ in (3.1) to
(3.5) divided by the number of d.f. (degrees of freedom), namely, I − 1, J − 1,
(I − 1)(J − 1), IJ(K − 1), IJK − I − J + 1, for x = A, B, AB, e, pe,
respectively. (In statements involving $MS_e$ it is assumed that K > 1.)
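The sums of squares (3.1) to (3.5) and their degrees of freedom translate directly into code. The sketch below is one possible plain-Python rendering for a balanced layout stored as a nested list `y[i][j][k]`; the function names and the small data set are hypothetical, chosen only so the arithmetic can be followed by hand.

```python
def mean(values):
    """Arithmetic mean of any iterable of numbers."""
    values = list(values)
    return sum(values) / len(values)

def sums_of_squares(y):
    """Sums of squares, d.f. and mean squares for a balanced I x J x K layout.

    y[i][j][k] is the kth observation in cell (i, j).
    """
    I, J, K = len(y), len(y[0]), len(y[0][0])
    cell = [[mean(y[i][j]) for j in range(J)] for i in range(I)]     # y_{ij.}
    row = [mean(cell[i]) for i in range(I)]                           # y_{i..}
    col = [mean(cell[i][j] for i in range(I)) for j in range(J)]      # y_{.j.}
    grand = mean(row)                                                 # y_{...}
    ss_a = J * K * sum((r - grand) ** 2 for r in row)                 # (3.1)
    ss_b = I * K * sum((c - grand) ** 2 for c in col)                 # (3.2)
    ss_ab = K * sum((cell[i][j] - row[i] - col[j] + grand) ** 2       # (3.3)
                    for i in range(I) for j in range(J))
    ss_e = sum((y[i][j][k] - cell[i][j]) ** 2                         # (3.4)
               for i in range(I) for j in range(J) for k in range(K))
    ss = {"A": ss_a, "B": ss_b, "AB": ss_ab, "e": ss_e,
          "pe": ss_ab + ss_e}                                         # (3.5)
    df = {"A": I - 1, "B": J - 1, "AB": (I - 1) * (J - 1),
          "e": I * J * (K - 1), "pe": I * J * K - I - J + 1}
    ms = {x: ss[x] / df[x] for x in ss if df[x] > 0}                  # MS_e needs K > 1
    return ss, df, ms

# Hypothetical outputs: I = 2 machines, J = 2 workers, K = 2 days per cell.
y = [[[10, 12], [14, 16]],
     [[11, 13], [19, 21]]]
ss, df, ms = sums_of_squares(y)
f_machines_no_interaction = ms["A"] / ms["pe"]   # statistic used when interactions are omitted
```

For this data the four basic sums of squares are 18, 72, 8 and 8, which add up to the total sum of squares about the grand mean (106), and the pooled error has 1 + 4 = 5 degrees of freedom.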

If we add the normality assumption (namely, that all random effects are 
normally distributed) we get exact F-tests of the hypotheses $H_A$: $\sigma_A^2 = 0$ and
$H_B$: $\sigma_B^2 = 0$ by employing in the usual way the statistics $MS_A/MS_{pe}$ and
$MS_B/MS_{pe}$. The power functions however are quite different under the three
models, involving, for example, only central F-distributions under the
random-effects model and only noncentral (this term includes “central”) F-distributions,
not central under the alternatives, for the fixed-effects model. These results are
all easily verified by substituting $y_{ijk}$ from the model equations into (3.1) to
(3.5), simplifying, and then applying well-known “linear hypothesis” theory.
When an F-test rejects, one would usually proceed differently in the three
models: thus, if $H_A$ is rejected one might use a multiple comparison method on
the $\{\alpha_i\}$ if they are fixed effects, and an interval estimate of $\sigma_A^2$ (an exact solution
is not at present available) or of $\sigma_A^2/\sigma_e^2$ if the $\{a_i\}$ are random.

The consequences on statistical inference are much more divergent when 
interaction terms are included in the three models. For the fixed-effects model 
we denote the interactions by $\{\gamma_{ij}\}$, so that the model equation becomes


(3.13)   $y_{ijk} = \mu + \alpha_i + \beta_j + \gamma_{ij} + e_{ijk}$,


where the $\{\alpha_i\}$, $\{\beta_j\}$, and $\{\gamma_{ij}\}$ satisfy (1.6) with p, q replaced by i, j. The
expected mean squares for this model are shown in column (i) of Table 1, where
$\sigma_{AB}^2$ is defined to be the following function of the fixed interactions,


(3.14)   $\sigma_{AB}^2 = (I - 1)^{-1}(J - 1)^{-1} \sum_i \sum_j \gamma_{ij}^2$.

This is under the assumption that the $\{e_{ijk}\}$ are independently distributed with
zero means and equal variances $\sigma_e^2$; if we add the normality assumption we get
the well-known theory of estimation and testing for this model; in particular,
$MS_A$, $MS_B$, and $MS_{AB}$ are all tested against $MS_e$.

5 Similar definitions with the denominators increased by unity were used by Daniels
(1939).

For the mixed model it seems inescapable (also for the random-effects model)
to regard the interaction between the jth level of the column factor and the
ith level of the row factor as a random effect, since the jth level of the column
factor is chosen at random from a population of levels. Let us denote it by
$c_{ij}$, so that we have the model equation


(3.15)   $y_{ijk} = \mu + \alpha_i + b_j + c_{ij} + e_{ijk}$


for the mixed model. (Similarly we write the model equation

(3.16)   $y_{ijk} = \mu + a_i + b_j + c_{ij} + e_{ijk}$


for the random-effects model.) What should we assume about the distribution
of the random variables $\{c_{ij}\}$? The easiest thing is to assume them independently
and identically distributed, with zero means, and independent of the $\{e_{ijk}\}$
and the $\{b_j\}$ (and of the $\{a_i\}$ in the random-effects model). But the assumption
that the $\{c_{ij}\}$ are independent of the $\{b_j\}$ (and of the $\{a_i\}$) is hard to accept.
Thus in the above example of the mixed model for machines and workers, it is
not unreasonable to assume that the J workers are chosen independently from
a population; but then to assume that after a certain worker is included in the
experiment, the interaction between him and any one of the I machines should
be independent of the worker (or at least of the worker’s main effect) and of the
machine seems to violate the very notion of interaction between worker and
machine. (A similar objection would apply in the random-effects model.)
Under this easy but unrealistic assumption on the $\{c_{ij}\}$ the expected mean
squares turn out to be those listed in column (ii) of Table 1; here $\sigma_{AB}^2$ denotes
$\mathrm{Var}(c_{ij})$. The table suggests that in the mixed or random-effects models, unlike
the fixed-effects model, the mean squares for main effects should be tested
against the interaction mean square. It is easy to show these procedures (as
well as testing $MS_{AB}$ against $MS_e$) give exact F-tests of the respective hypotheses
under the normality assumption (that all random effects are jointly normal).
Again the power in every case involves only noncentral F-distributions (all
central for the random-effects model).


TABLE 1

Expected values of mean squares

Mean square | (i) Fixed-effects model | (ii) Mixed or random-effects model with independent interactions | (iii) Mixed model with dependent interactions
$MS_A$ | $JK\sigma_A^2 + \sigma_e^2$ | $JK\sigma_A^2 + K\sigma_{AB}^2 + \sigma_e^2$ | $JK\sigma_A^2 + K\sigma_{AB}^2 + \sigma_e^2$
$MS_B$ | $IK\sigma_B^2 + \sigma_e^2$ | $IK\sigma_B^2 + K\sigma_{AB}^2 + \sigma_e^2$ | $IK\sigma_B^2 + \sigma_e^2$
$MS_{AB}$ | $K\sigma_{AB}^2 + \sigma_e^2$ | $K\sigma_{AB}^2 + \sigma_e^2$ | $K\sigma_{AB}^2 + \sigma_e^2$
$MS_e$ | $\sigma_e^2$ | $\sigma_e^2$ | $\sigma_e^2$
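The way Table 1 "suggests" a denominator for each test can be mechanized: set the variance symbol of the effect under test to zero and look for another mean square with the same expectation. The sketch below does this under the writer's reading of Table 1 (main-effect rows carry an extra $K\sigma_{AB}^2$ term in column (ii), and only the A row does in column (iii)); the function names and the numerical components are hypothetical.

```python
def expected_mean_squares(column, I, J, K, var_a, var_b, var_ab, var_e):
    """Expected mean squares as read from Table 1; column is 'i', 'ii' or 'iii'."""
    if column not in ("i", "ii", "iii"):
        raise ValueError("column must be 'i', 'ii' or 'iii'")
    ems_a = J * K * var_a + var_e
    ems_b = I * K * var_b + var_e
    if column in ("ii", "iii"):
        ems_a += K * var_ab        # interaction term enters E(MS_A)
    if column == "ii":
        ems_b += K * var_ab        # and E(MS_B) only with independent interactions
    return {"A": ems_a, "B": ems_b, "AB": K * var_ab + var_e, "e": var_e}

def matching_denominators(column, effect, I, J, K, var_a, var_b, var_ab, var_e):
    """Mean squares whose expectation equals that of `effect` under its null hypothesis."""
    components = {"var_a": var_a, "var_b": var_b, "var_ab": var_ab}
    components[{"A": "var_a", "B": "var_b", "AB": "var_ab"}[effect]] = 0
    ems = expected_mean_squares(column, I, J, K, var_e=var_e, **components)
    return [name for name in ems if name != effect and ems[name] == ems[effect]]

# Hypothetical integer components keep the comparisons exact.
args = dict(I=3, J=4, K=2, var_a=3, var_b=1, var_ab=2, var_e=5)
denominators = {(col, eff): matching_denominators(col, eff, **args)
                for col in ("i", "ii", "iii") for eff in ("A", "B", "AB")}
```

Matching expectations only suggests a ratio; whether that ratio has an exact F-distribution is a separate question, and the paper notes that for column (iii) the ratio $MS_A/MS_{AB}$ does not.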

Distribution assumptions on the random interactions $\{c_{ij}\}$ that seem
acceptable to the writer may be reached by following a trail broken by Tukey⁶
(1949). Let us consider first the case of a finite number of machines and men,
not all of which are going to be included in the experiment. If $\mu_{pq}$ is the “true”
output of the qth worker on the pth machine, there is no question about how to
define the main effects $\{\alpha_p\}$, $\{\beta_q\}$, and interactions $\{\gamma_{pq}\}$; they are defined by
equations (1.2), (1.3), and (1.4). Now let us conceive of the mixed model as a
limiting case where the number Q of workers in the population becomes infinite
and only a sample of J workers is included in the experiment, but all the machines
are included, so that p = i, P = I. Then the role of $\mu_{pq}$ is played by m(i, x),
where x labels the worker in the population, and m(i, x) is his “true” output on
the ith machine. It will be convenient to denote by $\mathcal{P}$ the population distribution
of x, even though it does not enter the calculations directly.⁷

Clearly the analogue of the resolution (1.5) is 


(3.17)   $m(i, x) = \mu + \alpha_i + b(x) + c_i(x)$,

where

(3.18)   $\mu = m(., .)$,

(3.19)   $\alpha_i = m(i, .) - m(., .)$,

(3.20)   $b(x) = m(., x) - m(., .)$,

(3.21)   $c_i(x) = m(i, x) - m(i, .) - m(., x) + m(., .)$,


and where replacing i by a dot in m(i, x) signifies that the arithmetic average
has been taken over i for i = 1, 2, ..., I, and replacing x by a dot means the
expectation has been taken over x with respect to $\mathcal{P}$. We may call $\{\alpha_i\}$ and
b(x) the main effects in the population, and $\{c_i(x)\}$ the population interactions,
and we note they satisfy the conditions


(3.22)   $\sum_i \alpha_i = 0$,  $E(b(x)) = 0$,  $\sum_i c_i(x) = 0$ (all x),  $E(c_i(x)) = 0$ (all i).


6 Tukey did not publish his results in a journal and they were independently found by
Wilk and Kempthorne, and Cornfield.

7 The reader interested in these points will easily supply the mathematical assumptions
under which $\mathcal{P}$ is a probability distribution on a probability space of points x, the I functions
m(i, x) are random variables with finite variance, and also the appropriate assumptions on
the product space of points (y, x) for the random-effects model below.
The random variables $b(x), c_1(x), \ldots, c_I(x)$ are not independent; their variances
and covariances are functions of the covariance matrix $(\sigma_{ii'})$ of the I random
variables $\{m(i, x)\}$. If

(3.23)   $\sigma_{ii'} = \mathrm{Cov}(m(i, x), m(i', x))$,


then


(3.24)   $\mathrm{Var}(b(x)) = \sigma_{..}$,


(3.25)   $\mathrm{Cov}(c_i(x), c_{i'}(x)) = \sigma_{ii'} - \sigma_{i.} - \sigma_{.i'} + \sigma_{..}$,


(3.26)   $\mathrm{Cov}(b(x), c_i(x)) = \sigma_{i.} - \sigma_{..}$.
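Relations (3.24) to (3.26) are simple row-and-column averaging operations on the covariance matrix of the true outputs, and a small computation makes the dependence between worker main effects and interactions visible. The sketch below assumes a symmetric matrix `sigma` with `sigma[i][h]` standing for $\sigma_{ii'}$; the function name and the 2 × 2 example matrix are hypothetical.

```python
from fractions import Fraction

def interaction_covariances(sigma):
    """Apply (3.24)-(3.26) to the covariance matrix of m(1, x), ..., m(I, x).

    sigma is a symmetric I x I nested list. Returns Var(b), the matrix of
    Cov(c_i, c_h), and the list of Cov(b, c_i).
    """
    I = len(sigma)
    s = [[Fraction(v) for v in row] for row in sigma]
    row_mean = [sum(s[i]) / I for i in range(I)]     # sigma_{i.} (= sigma_{.i} by symmetry)
    grand = sum(row_mean) / I                        # sigma_{..}
    var_b = grand                                                    # (3.24)
    cov_cc = [[s[i][h] - row_mean[i] - row_mean[h] + grand           # (3.25)
               for h in range(I)] for i in range(I)]
    cov_bc = [row_mean[i] - grand for i in range(I)]                 # (3.26)
    return var_b, cov_cc, cov_bc

# Hypothetical covariance matrix for I = 2 machines.
sigma = [[4, 2],
         [2, 9]]
var_b, cov_cc, cov_bc = interaction_covariances(sigma)
```

Here Var(b) = 17/4, each Var(c_i) = 9/4 with Cov(c_1, c_2) = −9/4 (as must happen when the interactions sum to zero over i), and Cov(b, c_1) = −5/4. The last value vanishes only when the row averages of the matrix are all equal, so independence of main effects and interactions is a special case rather than the rule.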


Suppose now a random sample of J workers is taken from the population. 
If they are labeled by $x_1, \ldots, x_J$, so that the $\{x_j\}$ are independently distributed
according to $\mathcal{P}$, then the “true” output of the jth worker in the experiment on
the ith machine will be $m(i, x_j)$, which we shall write $m_{ij}$. We shall assume the
observation $y_{ijk}$ to be of the form


(3.27)   $y_{ijk} = m_{ij} + e_{ijk}$,


where the set $\{e_{ijk}\}$ is independent of the set $\{m_{ij}\}$. We shall also assume the
$\{e_{ijk}\}$ to be independent with zero means and equal variance $\sigma_e^2$. Writing
$b_j = b(x_j)$ and $c_{ij} = c_i(x_j)$, we have from (3.27) and (3.17),

(3.28)   $y_{ijk} = \mu + \alpha_i + b_j + c_{ij} + e_{ijk}$,


where the $\{e_{ijk}\}$ are independent of the $\{b_j\}$ and $\{c_{ij}\}$, all have zero means, and
the J sets $\{b_j, c_{1j}, \ldots, c_{Ij}\}$ are independently and identically distributed like
the random vector $(b(x), c_1(x), \ldots, c_I(x))$ whose covariance matrix is given by
(3.24), (3.25), (3.26), the elements of the underlying covariance matrix $(\sigma_{ii'})$
being regarded as unknown parameters.

The appropriate definition of the symbols $\sigma_B^2$ and $\sigma_{AB}^2$ is suggested by starting
from the customary definitions for the finite set $\{\mu_{pq}\}$, namely,


(3.29)   $\sigma_B^2 = (Q - 1)^{-1} \sum_q \beta_q^2$,

(3.30)   $\sigma_{AB}^2 = (I - 1)^{-1}(Q - 1)^{-1} \sum_i \sum_q \gamma_{iq}^2$,


and going to the limit; the result is

(3.31)   $\sigma_B^2 = \mathrm{Var}(b(x))$,

(3.32)   $\sigma_{AB}^2 = (I - 1)^{-1} \sum_i \mathrm{Var}(c_i(x))$.


For details and discussion of this and the other results we shall now briefly
mention for this model, and for citations of related work, the reader is referred
to Scheffé (1956). With these definitions the expected mean squares are those in
column (iii) of Table 1. Under the normality assumption the tests of the
hypotheses $\sigma_B^2 = 0$ and $\sigma_{AB}^2 = 0$ suggested by the table, namely, those based on
$MS_B/MS_e$ and $MS_{AB}/MS_e$, turn out to be exact F-tests. However under the
hypothesis $\sigma_A^2 = 0$ (i.e., $\alpha_1 = \alpha_2 = \cdots = \alpha_I = 0$), the statistic $MS_A/MS_{AB}$
suggested by the table does not have an F-distribution; an exact test of this
hypothesis and an associated multiple-comparison method may be based on
Hotelling’s $T^2$ statistic.

Appropriate distribution assumptions about the interactions in the random- 
effects model may be motivated in a similar way. Let the machines be labeled 
by y with population distribution Q. We assume y and zx independent, corre- 
sponding to independent choices of machine and worker. Let the random 
variable m(y, x) be the “‘true’’ output of the worker labeled x on the machine 
labeled y. Define the general mean, main effects for machines, main effects for 
workers, and interactions, all in the population, respectively by 


(3.33) p= m(., .), 
(3.34) a(y) = m(y, .) — m(., .), 
(3.35) b(x) m(., x) — m(., -), 


(3.36) c(y, x) = m(y, x) — m(y, .) — m(., 2) + m(., .) 


’ 


where replacing x or y by a dot in m(y, x) indicates the expected value has been 
taken over z or y with respect to @ or Q, respectively. The analogue of (1.5) 
is now 


(3.37) m(y, 2) = uw + aly) + B(x) + ely, 2). 


If the I machines randomly selected for the experiment are labeled by
$y_1, \ldots, y_I$, the J workers by $x_1, \ldots, x_J$, so that the $\{x_j\}$ and $\{y_i\}$ are
independently distributed according to the population distribution of workers
and Q, respectively, then the "true" output of the jth worker on the ith machine
in the experiment will be $m(y_i, x_j)$. Write $m_{ij} = m(y_i, x_j)$, $a_i = a(y_i)$,
$b_j = b(x_j)$, $c_{ij} = c(y_i, x_j)$. Then assuming as in (3.27) that the errors
$\{e_{ijk}\}$ are independent of the $\{m_{ij}\}$ and are independently distributed with
zero means and common variance $\sigma_e^2$, we get the model equation


(3.38)    $y_{ijk} = \mu + a_i + b_j + c_{ij} + e_{ijk}$,


and may verify that all the random effects have zero means, the $a_1, \ldots, a_I$,
$b_1, \ldots, b_J$ are independent, and the $\{c_{ij}\}$ are uncorrelated with each other and
with the $\{a_i\}$ and $\{b_j\}$. The expected mean squares are then those in column
(ii) of Table 1. If we now add the assumption that all the random effects $\{a_i\}$,
$\{b_j\}$, $\{c_{ij}\}$ are jointly normal, this forces the $\{c_{ij}\}$ to be independent of the
$\{a_i\}$ and $\{b_j\}$ as well as among themselves, and we are back to the
random-effects model discussed earlier, with perhaps a little less aversion to the
independence assumptions on the interactions, or a little more suspicion about the
innocuousness of the normality assumption!
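
The zero-mean and zero-correlation properties claimed above can be checked on a small discrete example. The following is only a sketch under stated assumptions: the populations of machines and workers are taken to be finite, and the probabilities `q`, `p` and the table `m` of "true" outputs are hypothetical numbers chosen for illustration. Exact rational arithmetic is used so that the identities hold exactly.

```python
from fractions import Fraction as Fr

def decompose(m, q, p):
    """Population effects as in (3.33)-(3.36).

    m[y][x] is the 'true' output of worker x on machine y;
    q[y] and p[x] are the (hypothetical) population probabilities.
    """
    ny, nx = len(q), len(p)
    row = [sum(p[x] * m[y][x] for x in range(nx)) for y in range(ny)]  # m(y, .)
    col = [sum(q[y] * m[y][x] for y in range(ny)) for x in range(nx)]  # m(., x)
    mu = sum(q[y] * row[y] for y in range(ny))                         # m(., .)
    a = [row[y] - mu for y in range(ny)]
    b = [col[x] - mu for x in range(nx)]
    c = [[m[y][x] - row[y] - col[x] + mu for x in range(nx)] for y in range(ny)]
    return mu, a, b, c

# Hypothetical populations: three machines, two workers.
q = [Fr(1, 2), Fr(1, 3), Fr(1, 6)]
p = [Fr(1, 4), Fr(3, 4)]
m = [[Fr(10), Fr(14)], [Fr(7), Fr(9)], [Fr(20), Fr(5)]]
mu, a, b, c = decompose(m, q, p)

mean_a = sum(q[y] * a[y] for y in range(3))
mean_b = sum(p[x] * b[x] for x in range(2))
# Conditional means of the interaction given the machine, and given the worker.
c_given_y = [sum(p[x] * c[y][x] for x in range(2)) for y in range(3)]
c_given_x = [sum(q[y] * c[y][x] for y in range(3)) for x in range(2)]
# Covariances of the interaction with the main effects (all means are zero).
cov_ac = sum(q[y] * p[x] * a[y] * c[y][x] for y in range(3) for x in range(2))
cov_bc = sum(q[y] * p[x] * b[x] * c[y][x] for y in range(3) for x in range(2))
```

Because the interaction has conditional mean zero given either label, interactions for two different workers on the same machine (or two machines with the same worker) are uncorrelated, although in this example they are clearly not independent of the main effects.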

This approach to the random-effects model is easily extended to more factors, 
and it is again found that all the random effects in the model equation are un- 
correlated. The extension along the above lines (involving $T^2$) of the mixed
model to more factors is in progress; expected mean squares have already been
given by several writers, including Tukey (1949), Wilk and Kempthorne (1953- 
1955), and Bennett and Franklin (1954). 

In applying models like those discussed in this section one cannot evade the 
question as to the effect of the inevitable violations of some of the following 
three assumptions made on the errors: (i) independence, (ii) equal variance, 
and (iii) normality. Of these, (i) is the most difficult to discuss. The effects on 
the F-tests of certain kinds of correlation in the fixed-effects model have been 
studied by Daniels (1938) for the one-way layout and by Box (1954) for the two- 
way layout, and we merely mention that they can be serious. The effect on point 
estimation and tests, of correlation of errors due to the random assignment of 
treatment combinations to finite populations of experimental units will be 
treated in Section 4. That violation of (ii) should not seriously affect the F-tests 
in the case of balanced designs is suggested by approximations by Daniels 
(1938) and Horsnell (1953), and exact small-sample calculations by Box (1954). 
Such insensitivity of the F-tests to variance heterogeneity would then carry 
over to the multiple-comparisons methods associated with the F-test (Scheffé
(1953)), although single inferences using the t-distribution and based on the
assumption of variance homogeneity could be extremely misleading. As long as
we limit ourselves to calculating expected mean squares, violation of (iii) is of
course of no effect. Work of many writers,⁸ including E. S. Pearson (1931),
Box (1953), and Box and Andersen (1954) leads to the generalization that
nonnormality should have little effect on the validity of inferences about fixed
effects but may play havoc with inferences about random effects. There is
again the comforting consideration that multiple-comparison methods associated
with the F- or $T^2$-tests (Scheffé (1956)) should share with these tests their
insensitivity to violation of (iii).

Tukey’s (1949, 1951) and Wilk and Kempthorne’s (1953-1955) extensive 
work on alternative models is in a more general form than we have considered, 
in that, if a factor appears in the experiment at n levels, the levels are treated 
as a random sample (without replacement) from a finite population of N levels. 
This would be of interest, for example, in the illustration of machines and men 
if it were desired to make inferences about a finite population of N machines 
in the factory but it is feasible to include only a sample of n of them in the 
experiment. The above treatment of a factor with random main effects is then 
included as a limiting case for $N \to \infty$. Tukey regards the case of a factor with
fixed main effects as that where n = N. This imposes a certain symmetry on the 
levels which usually does not correspond to the situation where the model is 
applied, and was not assumed in the above treatment of the mixed model 
associated with (3.28), but which does not affect the expected values of the
usual mean squares, which are obviously symmetric in the levels (for further
discussion, see Scheffé (1956)). The rules for calculating expected mean squares
for models where the levels are sampled from finite populations may be found
also in the book by Bennett and Franklin (1954, p. 414). Most of Tukey's
(1951) work is concerned with moments, higher than the first, of the mean
squares and estimated variance components; in this normality is not assumed,
but the various sets (for each combination of factors) of random interactions
are assumed independent among themselves, of each other, and of the main
effects. Some results on moments have been obtained by Hooke (1954a, 1954b)
for Tukey's (1949) more realistic formulation of interactions in which the
interaction between the levels of two factors depends on the levels obtained in the
sampling.

⁸ Extensive moment calculations designed to assess the effects of violations of (ii) and
(iii) on F-tests have been published by David and Johnson (1951a, 1951b, 1951c, 1952) but
the numerical tables promised by them have not yet appeared, except for Horsnell's (1953)
paper which utilizes their calculations.


4. Randomization models. An important case of the two-way layout is the
randomized blocks design. To this the example of machines and workers we have
carried along thus far is not adapted, and we consider another: Suppose I
treatments of a crop are compared on J blocks of I plots each. The essential
feature of the design is that in each block the I treatments are assigned to the
I plots at random by use of a table of random numbers, or coin tossing, or urn
drawing, etc., the randomizations in the J blocks being independent. For this
design the following model was formulated by Neyman (1935, pp. 110-112,
145-150): In each block number the plots with $i' = 1, 2, \ldots, I$. Let $\mu_{iji'}$ be
the "true" yield under the ith treatment on the j, i' plot (i'th plot of the jth
block); this conceptual quantity is regarded as the expected value if the ith
treatment were applied in the j, i' plot. In a thought experiment involving a
sequence of repetitions under the same conditions, the observed yields of the
ith treatment in the j, i' plot would differ from $\mu_{iji'}$ on any particular trial by a
technical error $e_{iji'}$; this is regarded as a random variable and, by definition of
$\mu_{iji'}$, $E(e_{iji'}) = 0$. The randomization by which the treatments are assigned to
the plots is independent of the set $\{e_{iji'}\}$. We may write⁹


(4.1)    $\mu_{iji'} = \mu + \alpha_i + \beta_j + \gamma_{ij} + \varepsilon_{iji'}$,


where the general mean $\mu$, treatment main effects $\{\alpha_i\}$, block main effects $\{\beta_j\}$,
and treatment-block interactions $\{\gamma_{ij}\}$ are defined in terms of the $\{\mu_{ij\cdot}\}$ as in
(1.1) to (1.4), where $\mu_{ij}$ is replaced by $\mu_{ij\cdot}$, and


(4.2)    $\varepsilon_{iji'} = \mu_{iji'} - \mu_{ij\cdot}$.


We note the $\{\varepsilon_{iji'}\}$ satisfy $\sum_{i'} \varepsilon_{iji'} = 0$ for all i, j. The $\varepsilon_{iji'}$ is the unit (plot)
effect of the j, i' plot specific to the ith treatment, and within the jth block. If
$y_{ij}$ denotes the observation on the ith treatment in the jth block in the
experiment, then


(4.3)    $y_{ij} = \mu + \alpha_i + \beta_j + \gamma_{ij} + \varepsilon_{ij} + e_{ij}$,


where $\varepsilon_{ij}$ and $e_{ij}$ are respectively the $\varepsilon_{iji'}$ and $e_{iji'}$ for which $i' = i'(i, j)$ is the
plot number to which the ith treatment got assigned in the jth block. Neyman
called the $\{\varepsilon_{ij}\}$ the "soil errors"; we shall call them the unit errors. Employing
a convenient notation of Kempthorne (1952), the unit error $\varepsilon_{ij}$ and technical
error $e_{ij}$ may be written


(4.4)    $\varepsilon_{ij} = \sum_{i'} d^{j}_{ii'} \varepsilon_{iji'}$,    $e_{ij} = \sum_{i'} d^{j}_{ii'} e_{iji'}$,


where the $\{\varepsilon_{iji'}\}$ are regarded as unknown constants and the $\{d^{j}_{ii'}\}$ are $I^2 J$
random variables taking on only the values 0 and 1. Their joint distribution,
determined by the randomization scheme described above for assigning the
treatments to the plots, is evidently the following: For different j the J sets of
$I^2$ variables $\{d^{j}_{ii'}\}$ are independent. For fixed j, think of the set $\{d^{j}_{ii'}\}$ arranged
in an $I \times I$ square with $d^{j}_{ii'}$ in the ith row and i'th column; then the possible
values for the set are the I! ones in which there is exactly one 1 in each row and
column and 0's elsewhere, and these I! values are taken on with equal
probability. The $\{d^{j}_{ii'}\}$ are independent of the $\{e_{iji'}\}$; because of this and
$E(d^{j}_{ii'}) = 1/I$ it follows from (4.4), together with $\sum_{i'} \varepsilon_{iji'} = 0$ and
$E(e_{iji'}) = 0$, that


(4.5)    $E(\varepsilon_{ij}) = 0$,    $E(e_{ij}) = 0$.


⁹ Neyman's notation has been modified to that of this paper; he used a single term
$X_{\cdot\cdot}(k)$ for our $\mu + \alpha_i$, another, $B_i(k)$, for our $\beta_j + \gamma_{ij}$, and his $u_{ij}(k)$ is our $\varepsilon_{iji'}$.


Neyman showed that an unbiased estimate of any treatment difference $\alpha_i - \alpha_{i'}$
is $y_{i\cdot} - y_{i'\cdot}$; more generally, the same is true for the estimate


(4.6)    $\hat\psi = \sum_i \lambda_i y_{i\cdot}$


of any contrast $\psi = \sum_i \lambda_i \alpha_i$ ($\sum_i \lambda_i = 0$), from (4.3), (4.5), and (1.6).

Denote the sums of squares for treatments, for blocks, and for interactions
(usually called "for error") by $SS_A$, $SS_B$, and $SS_{AB}$; they are defined by
(3.1), (3.2), and (3.3) with K = 1 and the subscript k deleted. The calculation
of the expected values of the corresponding mean squares $MS_A$, $MS_B$, and
$MS_{AB}$ is greatly simplified if we can assume the set of technical errors $\{e_{ij}\}$ to
be independently and identically distributed, and independently of the set of
unit errors $\{\varepsilon_{ij}\}$. A sufficient condition for this is that the set $\{e_{iji'}\}$ be
independently and identically distributed, and this we shall assume until we
discuss tests below.

Neyman calculated¹⁰ $E(MS_A)$ and $E(MS_{AB})$ under some further simplifying
assumptions about the $\{\varepsilon_{iji'}\}$. The general values without further assumptions
follow from formulas first given by Kempthorne (1952, p. 148; the technical
errors¹¹ are assumed negligible there). Resolve the unit effect $\varepsilon_{iji'} = \mu_{iji'} - \mu_{ij\cdot}$
of the j, i' plot specific to the ith treatment into a unit main effect within the
jth block,


(4.7)    $\xi_{ji'} = \mu_{\cdot ji'} - \mu_{\cdot j\cdot}$,


plus a treatment-unit interaction within the jth block,


(4.8)    $\eta_{iji'} = \mu_{iji'} - \mu_{\cdot ji'} - \mu_{ij\cdot} + \mu_{\cdot j\cdot}$,


and define the symbols $\sigma_U^2$ and $\sigma_{AU}^2$ (corresponding to a "units factor" and to its
interaction with the "treatments factor") as


(4.9)     $\sigma_U^2 = J^{-1}(I - 1)^{-1} \sum_j \sum_{i'} \xi_{ji'}^2$,
(4.10)    $\sigma_{AU}^2 = J^{-1}(I - 1)^{-2} \sum_i \sum_j \sum_{i'} \eta_{iji'}^2$.

Then with $\sigma_A^2$, $\sigma_B^2$, $\sigma_{AB}^2$ defined by (3.8), (3.9), (3.14), and

(4.11)    $\sigma_e^2 = \operatorname{Var}(e_{ij})$,


the desired formulas are


(4.12)    $E(MS_A) = J\sigma_A^2 + \sigma_U^2 + I^{-1}(I - 2)\sigma_{AU}^2 + \sigma_e^2$,
(4.13)    $E(MS_B) = I\sigma_B^2 + I^{-1}(I - 1)\sigma_{AU}^2 + \sigma_e^2$,
(4.14)    $E(MS_{AB}) = \sigma_{AB}^2 + \sigma_U^2 + I^{-1}(I - 2)\sigma_{AU}^2 + \sigma_e^2$.


¹⁰ Professor O. Kempthorne pointed out to me a slip in Neyman's calculation; its effect
is the loss of the term $\sigma_{AB}^2$ from $E(MS_{AB})$.

¹¹ Technical errors are included in randomization models by Wilk (1955b).
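
Formulas (4.12) to (4.14) can be checked numerically by enumerating every possible randomization of a small design. The sketch below is illustrative only and rests on several assumptions: the technical errors are set to zero (so $\sigma_e^2 = 0$), the "true" yields `mu[i][j][k]` are arbitrary hypothetical numbers, and $\sigma_A^2$, $\sigma_B^2$, $\sigma_{AB}^2$ are taken to be the usual fixed-effects quantities $\sum\alpha_i^2/(I-1)$, $\sum\beta_j^2/(J-1)$ and $\sum\sum\gamma_{ij}^2/((I-1)(J-1))$, which is what (3.8), (3.9), (3.14) are assumed to say.

```python
from fractions import Fraction as Fr
from itertools import permutations, product

I, J = 3, 3  # treatments (= plots per block) and blocks; small enough to enumerate

# Hypothetical "true" yields mu[i][j][k] of treatment i on plot k of block j.
mu = [[[Fr((7 * i * i + 3 * j + 5 * k * k + 2 * i * k * (j + 1) + i * j) % 11)
        for k in range(I)] for j in range(J)] for i in range(I)]

def mean(values):
    values = list(values)
    return sum(values) / len(values)

m_ij = [[mean(mu[i][j]) for j in range(J)] for i in range(I)]            # mu_{ij.}
m_jk = [[mean(mu[i][j][k] for i in range(I)) for k in range(I)]
        for j in range(J)]                                                # mu_{.jk}
m_i = [mean(m_ij[i]) for i in range(I)]
m_j = [mean(m_ij[i][j] for i in range(I)) for j in range(J)]             # mu_{.j.}
m = mean(m_i)

alpha = [m_i[i] - m for i in range(I)]
beta = [m_j[j] - m for j in range(J)]
gamma = [[m_ij[i][j] - m_i[i] - m_j[j] + m for j in range(J)] for i in range(I)]
xi = [[m_jk[j][k] - m_j[j] for k in range(I)] for j in range(J)]          # (4.7)
eta = [[[mu[i][j][k] - m_jk[j][k] - m_ij[i][j] + m_j[j] for k in range(I)]
        for j in range(J)] for i in range(I)]                             # (4.8)

s2_A = sum(v * v for v in alpha) / (I - 1)            # assumed form of (3.8)
s2_B = sum(v * v for v in beta) / (J - 1)             # assumed form of (3.9)
s2_AB = sum(v * v for r in gamma for v in r) / ((I - 1) * (J - 1))  # assumed (3.14)
s2_U = sum(v * v for r in xi for v in r) / (J * (I - 1))            # (4.9)
s2_AU = sum(v * v for a in eta for r in a for v in r) / (J * (I - 1) ** 2)  # (4.10)

def mean_squares(y):
    """Usual MS_A, MS_B, MS_AB for one observation per cell, y[i][j]."""
    yi = [mean(y[i]) for i in range(I)]
    yj = [mean(y[i][j] for i in range(I)) for j in range(J)]
    g = mean(yi)
    ms_a = J * sum((v - g) ** 2 for v in yi) / (I - 1)
    ms_b = I * sum((v - g) ** 2 for v in yj) / (J - 1)
    ms_ab = sum((y[i][j] - yi[i] - yj[j] + g) ** 2
                for i in range(I) for j in range(J)) / ((I - 1) * (J - 1))
    return ms_a, ms_b, ms_ab

# Average the mean squares over all (I!)^J equally likely assignments.
perms = list(permutations(range(I)))
totals, count = [Fr(0)] * 3, 0
for assign in product(perms, repeat=J):
    y = [[mu[i][j][assign[j][i]] for j in range(J)] for i in range(I)]
    totals = [t + v for t, v in zip(totals, mean_squares(y))]
    count += 1
E_A, E_B, E_AB = (t / count for t in totals)

rhs_A = J * s2_A + s2_U + Fr(I - 2, I) * s2_AU
rhs_B = I * s2_B + Fr(I - 1, I) * s2_AU
rhs_AB = s2_AB + s2_U + Fr(I - 2, I) * s2_AU
```

Exact rational arithmetic is used so that the enumerated expectations can be compared with the right-hand sides by equality rather than within a tolerance.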


It is easy to derive an expression for the variance of the estimated contrast
(4.6) in terms of the unknown parameters $\{\varepsilon_{iji'}\}$ and $\sigma_e^2$, but there exists no
unbiased estimate of it. An overestimate can however be obtained by estimating
the contrast separately for each block by


(4.15)    $\hat\psi_j = \sum_i \lambda_i y_{ij}$


and using the sample variance of these J estimates,

(4.16)    $s^2 = (J - 1)^{-1} \sum_j (\hat\psi_j - \hat\psi)^2$.


From (4.3), $\hat\psi_j = \psi + \delta_j + u_j$, where $\delta_j = \sum_i \lambda_i \gamma_{ij}$, and $u_j = \sum_i \lambda_i(\varepsilon_{ij} + e_{ij})$,
the $u_j$ having zero means, and being independent since they are calculated from
different blocks. Since $\hat\psi = \hat\psi_\cdot$, therefore $\operatorname{Var}(\hat\psi) = J^{-2} \sum_j \operatorname{Var}(u_j)$. Then
$s^2/J$ is an overestimate of $\operatorname{Var}(\hat\psi)$ in the sense that $E(s^2/J) \ge \operatorname{Var}(\hat\psi)$, since
$E(s^2/J) = J^{-1}(J - 1)^{-1} \sum_j \delta_j^2 + \operatorname{Var}(\hat\psi)$. Clearly $s^2/J$ is an unbiased estimate
if $\sigma_{AB}^2 = 0$.
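
The identity for $E(s^2/J)$ can also be verified by complete enumeration of the randomizations. This is a minimal sketch, assuming zero technical errors, a hypothetical table of "true" yields `mu[i][j][k]`, and a hypothetical contrast with coefficients `lam`.

```python
from fractions import Fraction as Fr
from itertools import permutations, product

I, J = 3, 3
lam = [Fr(1), Fr(-1), Fr(0)]  # hypothetical contrast coefficients, summing to zero

# Hypothetical "true" yields of treatment i on plot k of block j.
mu = [[[Fr((i * i * (j + 2) + 3 * k * j + 2 * i * k + k) % 7) for k in range(I)]
       for j in range(J)] for i in range(I)]

def mean(values):
    values = list(values)
    return sum(values) / len(values)

m_ij = [[mean(mu[i][j]) for j in range(J)] for i in range(I)]   # mu_{ij.}
m_i = [mean(row) for row in m_ij]                               # mu_{i..}

# Because the coefficients sum to zero, psi = sum lam_i alpha_i = sum lam_i mu_{i..},
# and delta_j = sum lam_i gamma_ij = sum lam_i (mu_{ij.} - mu_{i..}).
psi = sum(l * v for l, v in zip(lam, m_i))
delta = [sum(lam[i] * (m_ij[i][j] - m_i[i]) for i in range(I)) for j in range(J)]

psi_hats, s2_over_J = [], []
for assign in product(list(permutations(range(I))), repeat=J):
    # Block-wise estimates (4.15) under this assignment of treatments to plots.
    psi_j = [sum(lam[i] * mu[i][j][assign[j][i]] for i in range(I)) for j in range(J)]
    psi_hat = mean(psi_j)                                        # equals (4.6)
    s2 = sum((v - psi_hat) ** 2 for v in psi_j) / (J - 1)        # (4.16)
    psi_hats.append(psi_hat)
    s2_over_J.append(s2 / J)

var_psi_hat = mean(v * v for v in psi_hats) - mean(psi_hats) ** 2
excess = mean(s2_over_J) - var_psi_hat   # E(s^2/J) - Var(psi_hat)
```

The quantity `excess` is the amount by which $s^2/J$ overestimates the variance on average; it vanishes exactly when every $\delta_j$ is zero.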

The problem of statistical tests under the randomization model associated
with (4.3) is complicated. Let us call normal-theory model that in which the
terms $\{e_{ij}\}$ in (4.3) are independently normally distributed with zero means
and equal variance $\sigma_e^2$, while the $\varepsilon_{ij}$ are always zero (which is equivalent to all
$\varepsilon_{iji'} = 0$). The usual F-test for treatment effects is then a test of the hypothesis
$\sigma_A^2 = \sigma_{AB}^2 = 0$ in the normal-theory model. Its power is usually considered
against alternatives with $\sigma_{AB}^2 = 0$, in which case the power can be expressed in
terms of the noncentral F-distribution. The randomization model seems very far
removed from the normal-theory model if the unit effects $\{\varepsilon_{iji'}\}$ are not small
compared with the $\sigma_e$ characterizing the technical errors. Nevertheless, following
Fisher (1935, Sec. 21) it has become a common belief among statisticians that
referring the usual F-statistic (in the above case, $MS_A/MS_{AB}$) to the tables
valid under the normal-theory model gives a good approximation in some sense
to the exact test under the randomization model, which we shall describe in a
moment and shall call the permutation test based on the F-statistic. The writer
has had difficulty in trying to formulate clearly the sense in which this
approximation is expected to hold.

The null hypothesis for the permutation test specifies that there is no
difference whatever among the treatments, that is, under the above randomization
model that $\sigma_A^2 = \sigma_{AB}^2 = \sigma_{AU}^2 = 0$ and the joint distribution of the IJ random
variables $\{e_{iji'}\}$, where $i' = i'(i, j)$ depends on the assignment of treatments to
plots, is the same for every one of the $(I!)^J$ assignments. This is equivalent to
the hypothesis that the joint distribution of the IJ observations under any of
the assignments is the same for all the assignments.

The permutation test based on the F-statistic is made as follows: Consider
the group G of all m permutations of the observations which leaves their
distribution invariant under the null hypothesis (in the above case G consists of the
$m = (I!)^J$ permutations obtained by making all possible I! permutations
within each block). If these m permutations are made on the observations actually
obtained in the experiment and the F-statistic calculated as though the permuted
observations had been obtained, a set of m (in general not distinct) values of F
will be generated. The idea is to reject the hypothesis at the $\alpha$ level of significance
if the value of F actually obtained lies among the $\alpha m$ largest of the m values;
some obvious qualifications have to be made because $\alpha m$ may not be an integer
and there is trouble about the m values not being distinct (for further details see
Scheffé (1943) or Hoeffding (1952)). For any fixed set of observations there is
thus determined a "significance level" $F_\alpha$ for the statistic F so that we reject
if $F > F_\alpha$, but $F_\alpha$ is a random variable depending on the outcome of the
experiment, and we write $F_\alpha = F_\alpha(y)$ to indicate this. In most of the potential
applications of the permutation test, the value of $F_\alpha(y)$ is extremely tedious to
calculate. The evidence that the usual F-test approximates the permutation test is of
three kinds:

First, numerical examples have been published where for particular sets of
observations it transpires that $F_\alpha(y)$ is close to the value in the F-tables; see
for example Eden and Yates¹² (1933), Fisher (1935, Sec. 21), Welch (1937,
p. 31), Pitman (1938, p. 334), and Kempthorne (1952, p. 132).

Second, there are moment calculations, up to fourth-order moments, made
on a transform of the F-statistic which has the incomplete beta distribution
under the hypothesis in the normal-theory model. These were made for
randomized blocks by Welch (1937) and Pitman (1938), who worked under the
above model with the restriction that the technical errors were assumed
identically zero. It is easy to remove this restriction by a conditional probability
argument about the probability of rejecting the hypothesis when true. A measure
of the magnitude of the unit effects $\{\xi_{ji'}\}$ in the jth block is
$\sigma_{U,j}^2 = (I - 1)^{-1} \sum_{i'} \xi_{ji'}^2$. Pitman's calculations on the first four moments indicate
that if the $\{\sigma_{U,j}^2\}$ do not differ too widely for the J blocks, then $F_\alpha(y)$ should
be close to the value in the F-tables. This is for the case of zero technical errors:
under this assumption and the null hypothesis all the parameters can be
calculated exactly from the observations, and in particular $\sigma_{U,j}^2$ becomes


(4.17)    $(I - 1)^{-1} \sum_i (y_{ij} - y_{\cdot j})^2$.


If we do not assume zero technical errors, then the condition of not too great
difference of the $\{\sigma_{U,j}^2\}$ characterizing the blocks is replaced by the same
condition on the functions (4.17) of the observations. Welch (1937) and Pitman
(1938) also made moment calculations for the Latin square in the same papers;
the calculation for the first moment had been published earlier by Yates (1933).
The approximation of the F-test to the permutation test for randomized blocks
can be improved by adjusting as follows the numbers of d.f. with which the
F-tables are entered: The first moment of the above-mentioned transform of the
F-statistic in its permutation distribution does not depend on the observations
and is the same as under the normal-theory model and the null hypothesis; the
second moment is determined by the quantities (4.17). If, as suggested by
Welch and Pitman, we choose the numbers of d.f. of an approximating
incomplete beta distribution to give the correct first two moments, this is equivalent
to referring the F-statistic to the F-tables with the same numbers of d.f. This
correction to the numbers of d.f. is given by Box and Andersen (1954), as well
as a similar correction for the one-way layout.

¹² Their results can be regarded as a comparison with values in the F-tables of estimates
of $F_\alpha(y)$ obtained by empirical sampling of the permutation distribution of the F-statistic,
for various levels $\alpha$ and a single set of "observations" y (not the actual observations but
averages of sets of 8 observations in a uniformity trial in randomized blocks; also, they
use ½ log F instead of F). This paper is clarified by a discussion between Yates (1935b,
pp. 164, 165) and Neyman.

Third, there are some asymptotic calculations. As the number J of blocks
increases with the number I of treatments fixed, the limiting distribution of the
F-statistic under the normal-theory model is chi-square with I − 1 d.f. Wald
and Wolfowitz (1944) showed that as J increases with fixed I, if the sequence
of observations satisfies certain restrictions, then the permutation distribution
of the F-statistic has the same limiting form. Hoeffding (1952) proved that as
J increases with fixed I, then under certain assumptions on the sequence of
distributions of the observations, the random variable "significance level"
$F_\alpha(y)$ of the permutation test approaches a constant in probability. With this
he was able to show that the permutation test had in a certain sense
asymptotically the same power as the usual F-test against alternatives of the
normal-theory model. Of course, what we would like to know more about is the power
of the usual F-test against the alternatives allowed by the randomization model.
An asymptotic calculation similar to Wald and Wolfowitz's just mentioned was
carried out for the one-way layout by Silvey (1954).
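
The permutation test based on the F-statistic, as described above for randomized blocks, can be sketched for a design small enough that all $m = (I!)^J$ permutations can be enumerated. The code below is an illustration under stated assumptions, not a reproduction of any of the cited calculations: the yields `y` are hypothetical integers, the function names are invented for this sketch, and the result is reported as the proportion of permuted F values at least as large as the observed one (so that the hypothesis is rejected at level $\alpha$ when this proportion does not exceed $\alpha$).

```python
import math
from fractions import Fraction as Fr
from itertools import permutations, product

def f_statistic(y):
    """MS_A / MS_AB for y[i][j] = observation on treatment i in block j."""
    I, J = len(y), len(y[0])
    row = [Fr(sum(y[i])) / J for i in range(I)]
    col = [Fr(sum(y[i][j] for i in range(I))) / I for j in range(J)]
    grand = sum(row) / I
    ss_a = J * sum((r - grand) ** 2 for r in row)
    ss_ab = sum((y[i][j] - row[i] - col[j] + grand) ** 2
                for i in range(I) for j in range(J))
    # MS_A / MS_AB = (J - 1) SS_A / SS_AB; a perfectly additive table gives infinity.
    return math.inf if ss_ab == 0 else (J - 1) * ss_a / ss_ab

def permutation_test(y):
    """Enumerate all within-block permutations and locate the observed F among them."""
    I, J = len(y), len(y[0])
    f_obs = f_statistic(y)
    perms = list(permutations(range(I)))
    values = [f_statistic([[y[assign[j][i]][j] for j in range(J)] for i in range(I)])
              for assign in product(perms, repeat=J)]
    p_value = Fr(sum(v >= f_obs for v in values), len(values))
    return f_obs, p_value, len(values)

# Hypothetical yields: 3 treatments (rows) in 3 blocks (columns).
y = [[12, 15, 11], [17, 19, 16], [25, 22, 27]]
f_obs, p_value, m = permutation_test(y)
```

Applying the same permutation of treatment labels in every block leaves the F-statistic unchanged, so the m values are never all distinct and the attainable proportion is bounded below by $I!/m$. This is one concrete form of the "trouble about the m values not being distinct" mentioned in the text.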


Randomization models have been formulated and expected mean squares
calculated for many other designs by Kempthorne (1952, 1955), Wilk (1955a,
1955b) and Wilk and Kempthorne (1953-1955, 1955). Fisher (1926) was the
pioneer in emphasizing the importance of randomization and in conceiving
of permutation tests (1925, Sec. 24; 1935, Sec. 21). In introducing randomized
blocks Fisher (1926) did not formulate explicitly a model like Neyman's, above,
and one might infer he had in mind a more restricted one, in particular with
$\sigma_{AB}^2 = 0$, since he claims that "One way of making sure that a valid estimate
of error will be obtained is to arrange the plots deliberately at random. . . ."
The mathematical model for the completely randomized experiment was given
by Neyman (1923) under the restriction of zero technical errors.

The writer has had the benefit of solicited comments on this paper from the 
following persons, and has improved the paper by incorporating many of their 
suggestions, for which he is very grateful: Professors William G. Cochran, Oscar 
Kempthorne, William Kruskal, H. Fairfield Smith, John W. Tukey, and Martin 
B. Wilk.


THE THEORY OF DECISION PROCEDURES FOR DISTRIBUTIONS
WITH MONOTONE LIKELIHOOD RATIO


By SAMUEL KARLIN AND HERMAN RUBIN 
Stanford University 


0. Introduction and Summary. In many statistical decision problems, the
observations can be summarized in a single sufficient statistic such that the
likelihood ratio for any two distributions in the family under consideration is a
monotone function of that statistic. This paper assumes, accordingly, that the
statistician's decision is to be based upon a single observation of a random
variable X, whose distribution is given by (1) and satisfies the inequality (2)
in Section 1. As examples of this family of distributions, we have the exponential
family such as the normal, binomial, and Poisson. Other kinds of examples are
given in Section 1.

In connection with the ordinary testing problem, Allen [1] showed that for
the composite testing problem of the one-sided type for the special case of the
exponential family of distributions, an admissible minimax procedure must be
of the form: choose action 1 (accept the hypothesis) if $x < x_0$ and choose action
2 (accept the alternative) if $x > x_0$. If $x = x_0$, randomization may be required.
Sobel [2] and Chernoff obtained partial results for the same class of
distributions when the set of decisions is finite.

This paper unifies, extends, and strengthens these results and treats a wide
variety of statistical decision problems for which the densities have a monotone
likelihood ratio.

In Section 1 the fundamental definition and preliminaries are introduced. 
In particular, the conditions imposed on the loss functions and the densities 
are delimited and some simple properties of these quantities are developed. 
In Section 2 we establish some of the basic lemmas. Noteworthy are Lemmas 
1 and 2 which express the variation of sign diminishing properties of the densities 
which possess a monotone likelihood ratio. 

The essential completeness of the set of all monotone strategies (see Section 
3 for the definition) in the class of all statistical procedures is demonstrated in 
Section 3 for the case of a finite number of actions. Section 4 deals with the 
problem of determining the form of all Bayes strategies for the statistician. 
The important problem of admissibility is studied in detail in Section 5. In 
the next section a study of the Bayes strategies for nature is made for the case 
of two actions. In Section 7 the complete class theory is carried through for the 
case of an infinite number of actions. This is accomplished by employing an 
argument involving a limiting procedure from the case of finite actions as 
treated in Section 3. The eighth section presents an analysis of the nature of 
the Bayes strategies for the case of an infinite number of actions. The final
section entails a brief discussion of the connection of invariance theory and the
conditions of monotonicity as are required throughout this paper.

Further extensions of these ideas in a different direction, which involves 
relaxing the conditions on the loss functions and strengthening the requirements 
on the densities, can be found in [3]. 


1. Definitions and preliminaries for the case of a finite number of actions.
Let the observed random variable, usually a sufficient statistic, be denoted
by x and the unknown state of nature by $\omega$. It is supposed that both variables
traverse subsets of the real line, X and $\Omega$, respectively. The set $\Omega$ can be taken
as an interval, without loss of generality, as will be shown in Lemma 1 below.
Let the cumulative distribution, when the true state of nature is described by
the parameter value $\omega$, have the form


(1)    $P(x \mid \omega) = \int_{-\infty}^{x} p(t \mid \omega)\, d\mu(t)$    ($\mu$ is a $\sigma$-finite measure on X),


where if $x_1 > x_2$ and $\omega_1 > \omega_2$, then the density function satisfies


(2)    $p(x_1 \mid \omega_1)p(x_2 \mid \omega_2) - p(x_1 \mid \omega_2)p(x_2 \mid \omega_1) \ge 0$.


Without loss of generality, the spectrum of $\mu$ is assumed to be all of X.

Any distribution of the form (1) which satisfies (2) will be said to possess
a monotone likelihood ratio (M.L.R.). Throughout this paper we shall be
concerned only with such distributions. The most noteworthy such class of
distributions consists of the exponential family of distributions, e.g.,


$p(x \mid \omega) = \beta(\omega)e^{x\omega}$.

Then,

$p(x_1 \mid \omega_1)p(x_2 \mid \omega_2) - p(x_1 \mid \omega_2)p(x_2 \mid \omega_1) = \beta(\omega_1)\beta(\omega_2)\,[e^{(x_1 - x_2)(\omega_1 - \omega_2)} - 1]\,e^{x_1\omega_2 + x_2\omega_1}$,


which is positive if $x_1 > x_2$ and $\omega_1 > \omega_2$. A more general class of distributions
for which (1) and (2) hold is given in [5]. This class includes as special cases
the noncentral t and noncentral F densities. Other examples of considerable
interest, which occur in many practical situations and possess a M.L.R., are
as follows: $d\mu(x) = dx$,


$p(x \mid \omega) = \begin{cases} n x^{n-1} / \omega^{n}, & 0 < x < \omega,\ \omega > 0,\ n \text{ a fixed positive integer}, \\ 0, & \text{elsewhere}, \end{cases}$


$p(x \mid \omega) = \begin{cases} n(n - 1)\, x^{n-2}(\omega - x) / \omega^{n}, & 0 < x < \omega,\ \omega > 0,\ n \text{ an integer} \ge 2, \\ 0, & \text{elsewhere}, \end{cases}$


p(x, w) we. x —-co2 Cw co, 

$p(x, \omega) = \begin{cases} e^{-(x - \omega)}, & x \ge \omega, \\ 0, & x < \omega. \end{cases}$


This last distribution is known as the exponential, or waiting time, distribution
and occurs in some models of life-testing experiments.
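
Inequality (2) can be checked numerically on a grid for several of the families mentioned above. The following is a rough sketch: the function names, the grids, and the tolerance are choices made here for illustration, and a grid check can only refute the M.L.R. property, never prove it. The Cauchy location family is included as a contrasting case, since it is a standard example of a family without a monotone likelihood ratio.

```python
import math
from itertools import combinations

def satisfies_mlr_on_grid(p, xs, ws, tol=1e-12):
    """Check inequality (2) for all x1 > x2 in xs and w1 > w2 in ws."""
    for x2, x1 in combinations(sorted(xs), 2):
        for w2, w1 in combinations(sorted(ws), 2):
            if p(x1, w1) * p(x2, w2) - p(x1, w2) * p(x2, w1) < -tol:
                return False
    return True

def normal(x, w):
    # Normal density with mean w and unit variance (a member of the exponential family).
    return math.exp(-0.5 * (x - w) ** 2) / math.sqrt(2 * math.pi)

def uniform_max(x, w, n=3):
    # n x^(n-1) / w^n on 0 < x < w, zero elsewhere.
    return n * x ** (n - 1) / w ** n if 0 < x < w else 0.0

def waiting_time(x, w):
    # exp(-(x - w)) for x >= w, zero elsewhere.
    return math.exp(-(x - w)) if x >= w else 0.0

def cauchy(x, w):
    # Cauchy location family, included only as a counterexample.
    return 1.0 / (math.pi * (1.0 + (x - w) ** 2))

xs = [0.25 * k for k in range(-8, 41)]   # grid on [-2, 10]
ws = [0.0, 0.5, 1.0, 2.0, 3.5]

results = {f.__name__: satisfies_mlr_on_grid(f, xs, ws)
           for f in (normal, uniform_max, waiting_time, cauchy)}
```

For the Cauchy family the ratio $p(x \mid 1)/p(x \mid 0)$ rises and then falls as x increases, which is why the check fails there.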

The first few sections treat the statistical decision problem where the statistician
has only a finite number n of actions. Let the loss function corresponding to the
ith action be denoted by $L_i(\omega)$ when the true parameter value is $\omega$. The following
requirements are imposed upon $L_i$:

A. The $L_i$ $(i = 1, \ldots, n)$ are defined throughout $\Omega$.

B. The number of changes of sign of $L_i - L_{i+1}$ is at most one for
$i = 1, \ldots, n - 1$. (A point $\omega_0$ is called a change point for a function h if in some
neighborhood of $\omega_0$,


$h(\omega)h(\omega^*) \le 0$,


whenever $\omega \le \omega_0 \le \omega^*$, and for some $\omega_1 \le \omega_0 \le \omega_2$, $h(\omega_1) \ne 0$ and $h(\omega_2) \ne 0$
with $\omega_1 \ne \omega_2$.) The number N(h) of changes of sign of the function h is the
$\sup_m N(h(\omega_i))$, $i = 1, \ldots, m$, where $N(h(\omega_i))$ is the number of changes of
sign of the sequence $h(\omega_1), h(\omega_2), \ldots, h(\omega_m)$ with $\omega_i < \omega_{i+1}$ and otherwise
arbitrary.

C. Let $S_i$ denote the set of $\omega$ in $\Omega$ where $L_i(\omega) = \min_j L_j(\omega)$. We assume
for each $S_i$ that $S_i < S_j$ for $i < j$, where $S < T$ means that the part of S not
in T lies to the left of T.

D. Let each function $L_i - L_{i+1}$ have precisely one change point, which we
denote by $\omega_i$.

Let us define the spectrum of p by


$\sigma_\omega = \{x \mid p(x \mid \omega) > 0\}$.


Then we may observe

Lemma A. If $x < y < z$ and $x \in \sigma_{\omega_1}$, $y \notin \sigma_{\omega_1}$, and $y \in \sigma_{\omega_2}$, then $\omega_1 < \omega_2$ and
$z \notin \sigma_{\omega_1}$, and similarly with ">" for "<".

Proof. Note that $p(x \mid \omega_1)p(y \mid \omega_2) > 0 = p(x \mid \omega_2)p(y \mid \omega_1)$, and since $x < y$,
we must have $\omega_1 < \omega_2$. Also $0 \le p(z \mid \omega_2)p(y \mid \omega_1) - p(y \mid \omega_2)p(z \mid \omega_1)$, but this
can happen only if $p(z \mid \omega_1) = 0$. A similar argument holds for the opposite sign.

Corollary. The set $\sigma_\omega$ for any $\omega$ is a relative interval in X, i.e., the intersection
of an interval and X.

A direct consequence of Lemma A is

Lemma B. If $\omega_1 < \omega_2$, then $\sigma_{\omega_1} < \sigma_{\omega_2}$.

Definition. $\Omega$ is called statistically connected if whenever $\omega < \omega'$ there
is a sequence $\omega = \omega_0 < \omega_1 < \cdots < \omega_n = \omega'$ such that


$P(\sigma_{\omega_{i+1}} \mid \omega_i) > 0$ for $i = 0, 1, \ldots, n - 1$.


We see that if $\Omega$ is not statistically connected, it can be decomposed into
statistically connected sets $\Omega_\alpha$. If $X_\alpha = \bigcup_{\omega \in \Omega_\alpha} \sigma_\omega$, then $P(X_\alpha \mid \omega)$ is zero if
$\omega \notin \Omega_\alpha$ and one if $\omega \in \Omega_\alpha$. Thus, for statistical purposes, we may as well deal
with each $\Omega_\alpha$ separately, since from any observation x we can recognize in
which $X_\alpha$ it lies and hence in which $\Omega_\alpha$ our unknown parameter value lies. It
is also clear from this that, without loss of generality, $\Omega_\alpha$ can be considered an
interval. Throughout the remainder of this paper we assume that $\Omega$ is
statistically connected.

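Lemma C. If $\omega < \omega'$ and $P(\sigma_{\omega'} \mid \omega) > 0$, then there exists a constant K such
that for all x and for all $\theta$, $\omega \le \theta \le \omega'$,


$p(x \mid \theta) \le K\{p(x \mid \omega) + p(x \mid \omega')\}$.


Proof. Since (by the corollary to Lemma A) $\sigma_\omega$ and $\sigma_{\omega'}$ are relative intervals,
the hypothesis yields the existence of an interval I such that $\sigma_\omega \cap \sigma_{\omega'} = I \cap X$.
Moreover, as $P(\sigma_\omega \cap \sigma_{\omega'} \mid \omega) > 0$, there exists an x and a $y > x$ (fixed from
now on) in $\sigma_\omega \cap \sigma_{\omega'}$ such that $p(x \mid \omega)$, $p(x \mid \omega')$, $p(y \mid \omega)$, and $p(y \mid \omega')$ are all
positive. Let $A = (-\infty, x)$, $B = [x, y]$, $C = (y, \infty)$ and let $\omega < \theta < \omega'$. If z is
in A, then


(a)    $p(z \mid \theta) \le \dfrac{p(z \mid \omega)p(x \mid \theta)}{p(x \mid \omega)} = c(\theta)p(z \mid \omega)$,


and if z is in C,


(b)    $p(z \mid \theta) \le \dfrac{p(z \mid \omega')p(y \mid \theta)}{p(y \mid \omega')} = d(\theta)p(z \mid \omega')$.


Also if z is in B,


(c)    $p(z \mid \theta) \ge \dfrac{p(z \mid \omega)p(x \mid \theta)}{p(x \mid \omega)} = c(\theta)p(z \mid \omega)$


and


(d)    $p(z \mid \theta) \ge \dfrac{p(z \mid \omega')p(y \mid \theta)}{p(y \mid \omega')} = d(\theta)p(z \mid \omega')$.


We obtain from (c) $1 \ge P(B \mid \theta) \ge c(\theta)P(B \mid \omega)$ or $c(\theta) \le 1/P(B \mid \omega)$ for all
$\omega < \theta \le \omega'$. Similarly, from (d) it can be inferred that $d(\theta) \le 1/P(B \mid \omega')$.

Now if z is in B


(e)    $p(z \mid \theta) \le \dfrac{p(z \mid \omega)p(y \mid \theta)}{p(y \mid \omega)} = \dfrac{p(y \mid \theta)}{p(y \mid \omega')} \cdot \dfrac{p(y \mid \omega')}{p(y \mid \omega)}\, p(z \mid \omega) = d(\theta)\,a\,p(z \mid \omega) \le \dfrac{a}{P(B \mid \omega')}\, p(z \mid \omega)$,


where $a = p(y \mid \omega') / p(y \mid \omega)$. Finally, (a) and (b) become for z in A


(f)    $p(z \mid \theta) \le \dfrac{1}{P(B \mid \omega)}\, p(z \mid \omega)$,


and for z in C


(g)    $p(z \mid \theta) \le \dfrac{1}{P(B \mid \omega')}\, p(z \mid \omega')$.


The three inequalities (e), (f), and (g) readily imply the result of our lemma.

Corollary. If $\Omega$ is statistically connected and $\theta_1$, $\theta_2$ are in $\Omega$ with $\theta_1 < \theta_2$,
then there exist $\omega_1, \ldots, \omega_n$ in $\Omega$ and a constant K such that for all x and all
$\theta_1 \le \theta \le \theta_2$,


$p(x \mid \theta) \le K \sum_{i=1}^{n} p(x \mid \omega_i)$.

This result will be used in section 6.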


2. Fundamental lemmas. The following lemma is a fundamental tool to be
used extensively throughout the sequel.

Lemma 1. If h changes sign at most once, then for F a measure,


$g(x) = \int p(x \mid \omega)h(\omega)\, dF(\omega)$


changes sign at most once.

Proof. Let $\omega_0$ denote a change point for h. Suppose for definiteness that
$h(\omega) \le 0$ for $\omega \le \omega_0$ and $h(\omega) \ge 0$ for $\omega > \omega_0$. Define $h_1(\omega) = h(\omega)$ for $\omega > \omega_0$
and $h_1(\omega) = 0$ for $\omega \le \omega_0$ and $h_2(\omega) = h_1(\omega) - h(\omega)$. Let


$g_i(x) = \int p(x \mid \omega)h_i(\omega)\, dF(\omega)$.


Clearly, $g(x) = g_1(x) - g_2(x)$. Consider for $x_1 > x_2$


(3)    $g_1(x_1)g_2(x_2) - g_2(x_1)g_1(x_2) = \int_{\omega_0}^{\infty} \int_{-\infty}^{\omega_0} [p(x_1 \mid \omega)p(x_2 \mid \theta) - p(x_1 \mid \theta)p(x_2 \mid \omega)]\, h_1(\omega)h_2(\theta)\, dF(\omega)\, dF(\theta) \ge 0$


on account of (2). As a consequence of (3), we cannot have $g(x_2) > 0$ while
$g(x_1) < 0$. Otherwise, $0 \le g_1(x_1) < g_2(x_1)$ and $0 \le g_2(x_2) < g_1(x_2)$. These last
two inequalities lead to an obvious contradiction of (3). Let $x_0$ be the supremum
of the set of all $x^*$ such that $g(x) \le 0$ for $x \le x^*$ ($-\infty \le x_0 \le \infty$). In view of
the facts established above, we find that $g(x) \le 0$ for $x < x_0$ and $g(x) \ge 0$
for $x > x_0$. This clearly implies that g changes sign at most once. Q.E.D.

Remark 1. It is useful to note that g changes sign in the same direction as
h if it changes sign at all.
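
Lemma 1 and Remark 1 can be illustrated numerically. The sketch below makes several assumptions for the sake of a concrete example: the density is the normal with mean $\omega$ and unit variance (which has a monotone likelihood ratio), F is a discrete measure with hypothetical weights on a grid of $\omega$ values, and $h(\omega) = \omega^3 - 0.2$ is a hypothetical function with a single change of sign from negative to positive. Sign changes of g are then counted on a grid of x values, ignoring values within a small tolerance of zero.

```python
import math

def normal_pdf(x, w):
    # Normal density with mean w and unit variance.
    return math.exp(-0.5 * (x - w) ** 2) / math.sqrt(2 * math.pi)

def h(w):
    # Hypothetical function changing sign once, from negative to positive.
    return w ** 3 - 0.2

omegas = [-3 + 0.5 * k for k in range(13)]           # support of the discrete measure F
weights = [1 + (k % 3) for k in range(len(omegas))]  # hypothetical positive masses

def g(x):
    """g(x) = integral of p(x | w) h(w) dF(w) for the discrete F above."""
    return sum(normal_pdf(x, w) * h(w) * f for w, f in zip(omegas, weights))

def sign_changes(values, tol=1e-9):
    """Number of sign changes in a sequence, skipping values within tol of zero."""
    signs = [1 if v > 0 else -1 for v in values if abs(v) > tol]
    return sum(1 for s, t in zip(signs, signs[1:]) if s != t)

xs = [0.05 * k for k in range(-160, 161)]            # grid on [-8, 8]
g_values = [g(x) for x in xs]
n_changes = sign_changes(g_values)
```

In this example g is negative for small x and positive for large x, so it changes sign exactly once and in the same direction as h, in agreement with Remark 1.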

From now on, unless stated to the contrary, when a function changes sign, 
then it will be assumed that the function changes from nonpositive values to 
nonnegative values as the independent variable increases. 

A careful study of the proof involved in Lemma 1 also shows 

Corouuary 1. Jf g(xo) = 0, but gi(x0) go(to) > O, then g(x) = 0 for x = xo 
and g(x) = Oforx S x%. 

In an analogous manner by defining ¢; and ¢» from ¢, we obtain 
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Lemma 2. If ¢ changes sign at most once in X, then 


Y(w) = p(x | w(x) du(z) 


changes sign at most once. Moreover, if ¥(wo) = 0 while 


where ¥(w) = Yfulw) — yo(w), theny(w) = 0 for w = wand yw) S Oforw S w. 
COROLLARY 2. Assuming the integrals are well defined, then the functions

$$q_i(x) = \int p(x \mid \omega)[L_i(\omega) - L_{i+1}(\omega)]\, dF(\omega)$$

with $F$ a measure, change signs at most once. Moreover, if some $\omega$ in $\Omega$, where
$L_i - L_{i+1} \ne 0$, belongs to the spectrum of $F$ and strict inequality holds in (2),
then $q_i$ is zero at most once.

Since $L_i(\omega)$ satisfy assumptions B, C and D, an application of Lemma 1
implies the statement of this corollary.

REMARK 2. If $\sigma_\omega = X$ for all $\omega$ in $\Omega$ and strict inequality takes place in (2)
for $x_i$ in $X$ and $\omega_i$ in $\Omega$, then the proof of Lemma 1 shows easily that $g$ can have
at most one zero, provided $F$ does not concentrate its full mass in the set of
zeros or change points of $h$. A similar comment can be made concerning Lemma 2.

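LEMMA 3. Let $0 \le \phi^* \le 1$ and $\int p(x \mid \bar\omega)\, \phi^*(x)\, d\mu(x) = c$ ($0 \le c \le 1$). There
exists an $x_0$ and $0 \le \lambda_0 \le 1$ such that if

$$\phi^0(x) = \begin{cases} 1, & x < x_0, \\ \lambda_0, & x = x_0, \\ 0, & x > x_0, \end{cases} \tag{3a}$$

then

$$\int p(x \mid \omega)[\phi^* - \phi^0]\, d\mu(x) \;\begin{cases} \ge 0, & \text{for } \omega \ge \bar\omega, \\ \le 0, & \text{for } \omega \le \bar\omega. \end{cases} \tag{4}$$

If $\sigma_\omega = X$ for all $\omega$, then the monotone strategy of the form (3a) satisfying (4) is
unique except for at most one point.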
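PROOF. Case 1: $0 < c \le 1$. Let $x_0$ be such that

$$\int_{x < x_0} p(x \mid \bar\omega)\, d\mu(x) \le c \le \int_{x \le x_0} p(x \mid \bar\omega)\, d\mu(x).$$

Such a value clearly exists. Define $\lambda_0$ ($0 \le \lambda_0 \le 1$) so that

$$\int_{x < x_0} p(x \mid \bar\omega)\, d\mu(x) + \lambda_0\, p(x_0 \mid \bar\omega)\, \mu\{x_0\} = c. \tag{4a}$$

($\lambda_0$ represents the amount of randomization necessary at $x_0$ for equality.) The
randomization is only necessary when $\mu\{x_0\} > 0$. In the case when $\mu\{x_0\} = 0$,
we always take $\lambda_0 = 0$. Define $\phi^0$ in terms of $x_0$ and $\lambda_0$ as in (3a). The number
of changes of sign of $\phi = \phi^* - \phi^0$ is at most one. Furthermore, $\phi = \phi_1 - \phi_2$,
where $\phi_1 = \phi^*$ for $x \ge x_0$ and $\phi_1 = 0$ for $x < x_0$, while $\phi_2 = \phi_1 - \phi$ is zero for
$x > x_0$. If $\int p(x \mid \bar\omega)\, \phi_1(x)\, d\mu(x) > 0$, then by virtue of Lemma 2 the result (4)
is confirmed. Consider now the possibility of $\int p(x \mid \bar\omega)\, \phi_1(x)\, d\mu(x) = 0$. It follows
from Lemma B that $\sigma_{\bar\omega} = X \cap [x_1, x_2]$, where $x_1 \le x_0 \le x_2$ and $\phi_1(x) = \phi^*(x) = 0$
for $x_0 \le x \le x_2$. Since

$$\int p(x \mid \bar\omega)[\phi^* - \phi^0]\, d\mu(x) = 0,$$

we must have $\phi^*(x) = 1$ for $x_1 \le x \le x_0$. As $\sigma_\omega < \sigma_{\bar\omega}$ for $\omega < \bar\omega$ and $\phi^0 \ge \phi^*$
when $x \le x_2$, we conclude that

$$\int p(x \mid \omega)(\phi^* - \phi^0)\, d\mu(x) \le 0.$$

On the other hand, for $\omega > \bar\omega$, $\sigma_\omega > \sigma_{\bar\omega}$. As $\phi^*(x) = 1$ for $x_1 \le x \le x_0$, we infer
that $\phi^*(x) \ge \phi^0(x)$ for $x \ge x_1$, and thus

$$\int p(x \mid \omega)(\phi^* - \phi^0)(x)\, d\mu(x) \ge 0.$$

This completes the proof of Case 1.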

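Case 2: $c = 0$. Define $\phi^0 = 1$ for $x < x_1$, $= 0$ for $x > x_1$, whereas
$\sigma_{\bar\omega} = X \cap [x_1, x_2]$, while $\phi^0(x_1) = 0$ or 1, according as $x_1$ is in $\sigma_{\bar\omega}$ or not. The condition
$c = 0$ implies that the set $S = \{x \mid \phi^*(x) > 0\}$ is disjoint from $\sigma_{\bar\omega}$. Again, for
$\omega < \bar\omega$, $\sigma_\omega < \sigma_{\bar\omega}$, so $\phi^0 \ge \phi^*$ for $x \le x_2$. Consequently,

$$\int p(x \mid \omega)(\phi^* - \phi^0)\, d\mu(x) \le 0.$$

If $\omega > \bar\omega$, then $\sigma_\omega > \sigma_{\bar\omega}$, and since $\phi^*(x) \ge \phi^0(x)$ whenever $x > x_1$, we conclude
that

$$\int p(x \mid \omega)(\phi^* - \phi^0)\, d\mu(x) \ge 0.$$

The proof of the last part of the lemma is obvious.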

The greater detail in the above proof is necessitated only by the zeros possible
in (2). If the conditions of Remark 2 are satisfied, the conclusion of (4) is
immediate by virtue of the construction in (4a) and Lemma 2.
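
The construction in Lemma 3 can be sketched numerically. The example below is an illustration under stated assumptions, not the authors' computation: it assumes the family $N(\omega, 1)$ (so $\mu$ has no atoms and no randomization at $x_0$ is needed), takes $\phi^*$ to be the indicator of a union of intervals, and uses the hypothetical helper names `matching_cutoff` and `difference`.

```python
from statistics import NormalDist

STD = NormalDist()  # standard normal; N(w, 1) has a monotone likelihood ratio in x

def prob_interval(lo, hi, w):
    """P(lo < X < hi) for X ~ N(w, 1)."""
    return STD.cdf(hi - w) - STD.cdf(lo - w)

def matching_cutoff(phi_star_intervals, w_bar):
    """Cutoff x0 of the monotone rule 1{x < x0} whose integral at w_bar equals c.

    phi_star is assumed to be the indicator of a union of disjoint intervals,
    so that every integral reduces to a normal probability.
    """
    c = sum(prob_interval(lo, hi, w_bar) for lo, hi in phi_star_intervals)
    return w_bar + STD.inv_cdf(c)

def difference(phi_star_intervals, x0, w):
    """Integral of p(x | w) [phi_star(x) - phi0(x)] dmu(x), with phi0 = 1{x < x0}."""
    mass_star = sum(prob_interval(lo, hi, w) for lo, hi in phi_star_intervals)
    return mass_star - STD.cdf(x0 - w)

phi_star = [(-1.0, 0.5), (2.0, 3.0)]  # a non-monotone rule (illustrative)
w_bar = 0.0
x0 = matching_cutoff(phi_star, w_bar)
```

With these inputs `difference` vanishes at `w_bar`, is nonnegative for larger parameter values and nonpositive for smaller ones, which is the pattern (4) describes.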


3. Essential completeness. A strategy for the statistician is a collection of
functions $\phi = \{\phi_i, i = 1, \dots, n\}$ depending on the observed variable where
$\phi_i(x)$ equals the probability of taking action $i$ when $x$ was observed. Of course,
$0 \le \phi_i(x) \le 1$ and $\sum_{i=1}^{n} \phi_i(x) = 1$. A strategy $\phi$ is called monotone if there
exists a set of numbers $x_i$, $i = 0, \dots, n$, with $x_i \le x_{i+1}$, $x_0 = -\infty$, and $x_n = +\infty$,
such that

$$\phi_i(x) = \begin{cases} 1, & x_{i-1} < x < x_i, \\ 0, & x < x_{i-1} \text{ or } x > x_i. \end{cases}$$

Randomization for a monotone strategy thus can only occur at the boundary
values $x_i$ which describe the particular monotone strategy. Note that if $x_{i-1} = x_i$
and $\phi_i(x_i) = 0$, then action $i$ could never be taken according to strategy $\phi$.
We shall frequently describe a monotone strategy merely by the set of boundary
values $(x_i)$.

Let $\omega_i$ denote the change point of $L_i - L_{i+1}$. By assumptions C and D we
have that $\omega_i \le \omega_{i+1}$, $i = 1, 2, \dots, n - 1$.

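LEMMA 4. For any strategy $\phi = \{\phi_i\}$ there exists a monotone strategy $\phi^0 = \{\phi^0_i\}$
such that

$$\int p(x \mid \omega)\Big(\sum_{j=1}^{i} \phi_j(x) - \sum_{j=1}^{i} \phi^0_j(x)\Big)\, d\mu(x) \;\begin{cases} \ge 0, & \text{for } \omega \ge \omega_i, \\ \le 0, & \text{for } \omega \le \omega_i. \end{cases} \tag{5}$$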

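PROOF. For each $i$ with $\bar\omega = \omega_i$ and $\phi^* = \sum_{j=1}^{i} \phi_j(x)$, by virtue of Lemma 3
there exists a strategy

$$\psi_i(x) = \begin{cases} 1, & x < x_i, \\ \lambda_i, & x = x_i, \\ 0, & x > x_i, \end{cases}$$

such that

$$\int p(x \mid \omega)\Big(\psi_i(x) - \sum_{j=1}^{i} \phi_j(x)\Big)\, d\mu(x) \;\begin{cases} \le 0, & \text{for } \omega \ge \omega_i, \\ \ge 0, & \text{for } \omega \le \omega_i. \end{cases} \tag{6}$$

As $\omega_{i+1} \ge \omega_i$, (6) yields

$$0 = \int p(x \mid \omega_{i+1})\Big(\psi_{i+1}(x) - \sum_{j=1}^{i+1} \phi_j(x)\Big)\, d\mu(x) \ge \int p(x \mid \omega_{i+1})\Big(\psi_i(x) - \sum_{j=1}^{i+1} \phi_j(x)\Big)\, d\mu(x). \tag{7}$$

Since

$$\psi_{i+1}(x) = \begin{cases} 1, & x < x_{i+1}, \\ \lambda_{i+1}, & x = x_{i+1}, \\ 0, & x > x_{i+1}, \end{cases}$$

and on account of (6) and (7), we see that $\psi_{i+1}(x) \ge \psi_i(x)$ and $x_{i+1} \ge x_i$. Define
$\phi^0_{i+1}(x) = \psi_{i+1}(x) - \psi_i(x)$, $i = 0, 1, \dots, n - 1$, where we have set $\psi_0(x) = 0$.
It is an easy matter to verify that $\phi^0 = \{\phi^0_i\}$ is a monotone strategy characterized
by the boundary points $x_i$ and that the inequalities (5) are satisfied.

THEOREM 1. The collection of all monotone procedures constitutes an essentially
complete class of strategies.¹

¹ For the exponential case, this theorem was proved in [4].

PROOF. Let $\phi = \{\phi_i\}$ denote an arbitrary strategy; then the risk when nature's
choice is $\omega$ becomes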


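$$\begin{aligned} \rho(\omega, \phi) &= \int p(x \mid \omega) \sum_{i=1}^{n} \phi_i(x) L_i(\omega)\, d\mu(x) \\ &= \int p(x \mid \omega) \Big\{ \sum_{i=1}^{n-1} \phi_i(x)[L_i(\omega) - L_n(\omega)] + L_n(\omega) \Big\}\, d\mu(x). \end{aligned}$$

Writing $L_i(\omega) - L_n(\omega) = \sum_{j=i}^{n-1} [L_j(\omega) - L_{j+1}(\omega)]$ and interchanging orders
of summation, we get

$$\rho(\omega, \phi) = \int p(x \mid \omega) \Big\{ \sum_{i=1}^{n-1} (L_i(\omega) - L_{i+1}(\omega)) \sum_{j=1}^{i} \phi_j(x) + L_n(\omega) \Big\}\, d\mu(x). \tag{8}$$

We seek to find a monotone strategy $\{\phi^0_i\}$ so that $\rho(\omega, \phi^0) \le \rho(\omega, \phi)$. The
difference becomes

$$\rho(\omega, \phi) - \rho(\omega, \phi^0) = \sum_{i=1}^{n-1} [L_i(\omega) - L_{i+1}(\omega)] \int p(x \mid \omega) \Big( \sum_{j=1}^{i} \phi_j(x) - \sum_{j=1}^{i} \phi^0_j(x) \Big)\, d\mu(x). \tag{9}$$

By Lemma 4, the monotone strategy $\phi^0_i(x)$ is constructed so that

$$\int p(x \mid \omega) \Big[ \sum_{j=1}^{i} \phi_j(x) - \sum_{j=1}^{i} \phi^0_j(x) \Big]\, d\mu(x) \;\begin{cases} \ge 0, & \omega \ge \omega_i, \\ \le 0, & \omega \le \omega_i. \end{cases}$$

But, also, by assumption

$$L_i(\omega) - L_{i+1}(\omega) \;\begin{cases} \ge 0, & \omega \ge \omega_i, \\ \le 0, & \omega \le \omega_i. \end{cases}$$

Consequently, every term in the sum of (9) is nonnegative and hence $\rho(\omega, \phi) \ge \rho(\omega, \phi^0)$.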
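
The interchange of summation behind (8) is a summation-by-parts identity on the integrand, and it can be checked pointwise. The sketch below is illustrative only: it assumes a single fixed $x$ and $\omega$, so that the action probabilities and losses are plain lists, and the function names are hypothetical.

```python
def risk_integrand_direct(phi, losses):
    """sum_i phi_i * L_i: the bracket in the first expression for the risk."""
    return sum(p * l for p, l in zip(phi, losses))

def risk_integrand_telescoped(phi, losses):
    """sum_{i<n} (L_i - L_{i+1}) * sum_{j<=i} phi_j + L_n: the bracket in (8)."""
    total, cumulative = losses[-1], 0.0
    for i in range(len(losses) - 1):
        cumulative += phi[i]  # partial sum phi_1 + ... + phi_i
        total += (losses[i] - losses[i + 1]) * cumulative
    return total

# Illustrative values: action probabilities at one x (they sum to 1)
# and the losses L_1, ..., L_4 at one parameter value.
phi = [0.1, 0.2, 0.3, 0.4]
losses = [3.0, 1.0, 0.5, 2.0]
direct = risk_integrand_direct(phi, losses)
telescoped = risk_integrand_telescoped(phi, losses)
```

Both expressions agree whenever the probabilities sum to one, which is what allows the risk difference (9) to be written in terms of the partial sums alone.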

REMARK 3. It is important to observe for later use that the monotone strategy
$\phi^0$ constructed to dominate $\phi$ depends only on the specified points $\omega_i$ and in
no other manner on the loss functions $L_i$.

More precise results can be obtained if the number of actions $n$ equals 2.

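THEOREM 2. Let $\sigma_\omega = X$ for all $\omega$ and suppose $L_1 - L_2$ has precisely one change
point $\omega_1$ and no other possible zeros, then every nonmonotone procedure
$\phi = (\phi_1, 1 - \phi_1)$ is dominated by a unique monotone strategy.

PROOF. In order for any strategy $\phi^0$ to dominate $\phi$, then according to (9)

$$0 \le \rho(\omega, \phi) - \rho(\omega, \phi^0) = [L_1(\omega) - L_2(\omega)] \int p(x \mid \omega)(\phi_1 - \phi^0_1)(x)\, d\mu(x). \tag{10}$$

Since both factors must change signs at the same point $\omega_1$, we must have

$$\int p(x \mid \omega)(\phi_1 - \phi^0_1)(x)\, d\mu(x) \;\begin{cases} \ge 0, & \omega \ge \omega_1, \\ \le 0, & \omega \le \omega_1. \end{cases}$$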


Applying Lemma 3 leads immediately to the conclusion.


4. Bayes strategies. In this section we further assume that $L_i - L_j$ ($j > i$)
has precisely one sign change. Also, we suppose that $L_i$ possess enough smoothness
properties to ensure the existence of all integrals involving these quantities.
Furthermore, assume that $\sigma_\omega = X$ for all $\omega$. Let $F$ denote a distribution function
possessing more than one point in its spectrum. Define

$$\lambda_{ij}(x) = \int p(x \mid \omega)[L_i(\omega) - L_j(\omega)]\, dF(\omega) \qquad (j > i).$$


Let us suppose that the relation (2) holds with strict inequality. By virtue of
Corollary 2, $\lambda_{ij}$ changes sign at most once in the direction of negative to positive
values and $\lambda_{ij}$ has at most one zero. Therefore, if $F$ represents a given distribution
for nature, then action $i$ is preferred to action $j$ whenever $\lambda_{ij}(x) < 0$, and
$x$ was observed, while the reverse situation holds when $\lambda_{ij}(x) > 0$. If $\lambda_{ij}(x) = 0$,
then for that observed $x$ the statistician is indifferent in choosing between action
$i$ and $j$. Summarizing, action $i$ is chosen over $j$ provided $x < x_0$ and action $j$ is
desired over action $i$ when $x > x_0$ for some appropriate $x_0$ ($x_0$ is the change
point of $\lambda_{ij}$). This analysis is valid provided $j > i$. Therefore, for a given distribution
of nature the optimal Bayes strategy requires that if $x$ is observed and
action $j$ is favored over some other action $i$ ($j > i$), then for all larger $x$ the same
is true. This leads readily to the result that if action 1 is ever taken, then it
must be taken when $x < x_1$ for some suitable $x_1$. Continuing the same reasoning
yields the existence of values $x_i$ such that

$$\phi_i(x) = \begin{cases} 1, & x_{i-1} < x < x_i, \\ 0, & x < x_{i-1} \text{ or } x > x_i, \end{cases}$$

($x_0 = -\infty$) represents the unique Bayes strategy against $F$. This optimal
monotone procedure is unique with the exception of $n - 1$ possible values
$\{x_i, i = 1, \dots, n - 1\}$ where randomization for the statistician may be allowed.
Summing up, we have established

THEOREM 3. If $F$ is an a priori distribution for nature which does not concentrate
its full mass at a value $\omega$, where $L_i(\omega) = L_j(\omega)$, for some $i < j$, and $L_i - L_j$ changes
sign precisely once, then the Bayes strategy is monotone and uniquely determined
except for at most $n$ values of the variable $x$.

(It was assumed here that $\sigma_\omega = X$, and that strict inequality holds in (2).)

Examples can be given of Bayes strategies in which some actions are never
taken. Let $L_1(\omega) = \omega - \theta_1$ for $\omega > \theta_1$, 0 elsewhere; $L_2(\omega) = \theta_1 - \omega$ for $\omega < \theta_1$,
$\omega - \theta_2$ for $\omega > \theta_2$, 0 elsewhere; $L_3(\omega) = \theta_2 - \omega$ for $\omega < \theta_2$, 0 elsewhere; where
$\theta_1 = 2$, $\theta_2 = 4$, $p(x \mid \omega) = e^{x\omega - \omega^2/2}$, $d\mu(x) = e^{-x^2/2}/\sqrt{2\pi}\, dx$, $X = \Omega = (-\infty, \infty)$.
If $F$ concentrates at 5 and $-5$ with probability $\tfrac12$ each, then it can be easily
shown that the Bayes strategy has the form that action 1 is taken if $e^{-10x} > \tfrac13$
and action 3 is taken when $e^{-10x} < \tfrac13$. Here action 2 is never taken.
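
This example can be checked numerically. The sketch below is an illustration under assumptions read from the text rather than a definitive reconstruction: it takes $\theta_1 = 2$, $\theta_2 = 4$, the normal likelihood proportional to $\exp(\omega x - \omega^2/2)$, and a prior with mass one half at $5$ and at $-5$. The function names are hypothetical.

```python
import math

THETA1, THETA2 = 2.0, 4.0  # assumed values of theta_1 and theta_2

def losses(w):
    """(L1, L2, L3) at parameter value w for the piecewise-linear example."""
    l1 = max(w - THETA1, 0.0)
    l2 = max(THETA1 - w, 0.0) + max(w - THETA2, 0.0)
    l3 = max(THETA2 - w, 0.0)
    return l1, l2, l3

def posterior_expected_losses(x, prior=((5.0, 0.5), (-5.0, 0.5))):
    """Unnormalized posterior expected loss of each action given x.

    Uses p(x | w) proportional to exp(w*x - w**2/2), the N(w, 1) likelihood;
    factors common to all three actions are dropped.
    """
    totals = [0.0, 0.0, 0.0]
    for w, mass in prior:
        weight = mass * math.exp(w * x - 0.5 * w * w)
        for k, loss in enumerate(losses(w)):
            totals[k] += weight * loss
    return totals

def bayes_action(x):
    """Index (1, 2 or 3) of an action minimizing posterior expected loss."""
    totals = posterior_expected_losses(x)
    return 1 + min(range(3), key=lambda k: totals[k])
```

Under these assumptions the comparison of actions 1 and 3 switches where $e^{10x} = 3$, and action 2 would require $e^{10x} > 3.5$ and $e^{10x} < 2$ simultaneously, so it is never selected.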


5. Admissibility of monotone procedures. To determine when the monotone
procedures form a minimal complete class is, in general, very complicated. In
this section we obtain several results which provide the answers to the question
of admissibility in some specific circumstances. These examples point out the
dependence of the question of minimality on the loss functions involved.

I. Admissibility for $n = 2$ actions. Throughout this section we assume that
$\sigma_\omega = X$ and that for $x_1 > x_2$, $\omega_1 > \omega_2$, $x_i$ in $X$, and $\omega_i$ in $\Omega$,

$$p(x_1 \mid \omega_1)\, p(x_2 \mid \omega_2) > p(x_1 \mid \omega_2)\, p(x_2 \mid \omega_1).$$

THEOREM 4. If the loss functions $L_1 - L_2$ change signs precisely once with $\omega_0$,
the change point, such that $\omega_0$ is interior to $\Omega$, and $L_1 - L_2$ possesses no other possible
zero in the neighborhood of $\omega_0$ except possibly $\omega_0$, then every monotone procedure is
admissible.

First we establish 

LEMMA 5. Under the assumptions of Theorem 4 every monotone strategy involving
both actions is unique Bayes (except for at most one value of $x$) against a two-point
distribution $F$ for nature. Moreover, the two points $\omega_1$ and $\omega_2$ can be prescribed
arbitrarily subject only to the condition $\omega_1 < \omega_0$ and $\omega_2 > \omega_0$.

PROOF. If $\omega_1 < \omega_0$ and $\omega_2 > \omega_0$, then by hypothesis $L_1(\omega_1) - L_2(\omega_1) < 0$ and
$L_1(\omega_2) - L_2(\omega_2) > 0$. Let $\phi = (\phi_1, 1 - \phi_1)$ denote a monotone procedure determined
by the point $x_0$ in $X$, viz.,

$$\phi_1(x) = \begin{cases} 1, & x < x_0, \\ \lambda, & x = x_0, \\ 0, & x > x_0, \end{cases} \qquad x_0 \text{ interior to } X. \tag{11}$$

Determine $\mu$ by the relation

$$\mu\, p(x_0 \mid \omega_1)[(L_1 - L_2)(\omega_1)] = (1 - \mu)\, p(x_0 \mid \omega_2)[(L_2 - L_1)(\omega_2)]. \tag{12}$$

It follows from (12) that $0 < \mu < 1$. Let $F$ be the distribution concentrating
at $\omega_1$ with weight $\mu$ and at $\omega_2$ with weight $1 - \mu$. Equations (8) and (9) yield
the result that

$$g(x) = \int p(x \mid \omega)[L_1(\omega) - L_2(\omega)]\, dF(\omega)$$

vanishes for $x = x_0$. By Corollary 2 and Theorem 3, we infer that $g(x) > 0$
for $x > x_0$ and $g(x) < 0$ for $x < x_0$. This means that the unique procedure for
the statistician in minimizing the risk is to take action 2 for $x > x_0$ and action 1
for $x < x_0$. If $x = x_0$, then either action yields the same expected return.
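
The two-point prior of Lemma 5 is obtained by solving the single linear relation (12) for the weight. A minimal sketch follows, assuming the $N(\omega, 1)$ density and treating the two loss differences as given numbers; `two_point_weight`, `g`, and the numerical inputs are hypothetical and chosen only for illustration.

```python
import math

def normal_pdf(x, w):
    """Density of N(w, 1) at x (an assumed example of the family p(x | w))."""
    return math.exp(-0.5 * (x - w) ** 2) / math.sqrt(2.0 * math.pi)

def two_point_weight(x0, w1, w2, d1, d2, density=normal_pdf):
    """Solve (12) for mu, where d1 = (L1 - L2)(w1) < 0 and d2 = (L1 - L2)(w2) > 0."""
    a = density(x0, w1) * (-d1)  # positive
    b = density(x0, w2) * d2     # positive
    return b / (a + b)

def g(x, mu, w1, w2, d1, d2, density=normal_pdf):
    """g(x) = integral of p(x | w) [L1(w) - L2(w)] dF(w) for the two-point F."""
    return mu * density(x, w1) * d1 + (1.0 - mu) * density(x, w2) * d2

# Illustrative inputs: cutoff x0, two parameter points, two loss differences.
x0, w1, w2, d1, d2 = 0.3, -1.0, 2.0, -1.5, 0.7
mu = two_point_weight(x0, w1, w2, d1, d2)
```

With the weight chosen this way `g` vanishes at `x0`, is negative to the left and positive to the right, so the rule with cutoff `x0` is Bayes against this two-point prior.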

PROOF OF THEOREM 4. It will now be shown that the monotone strategy
described by (11) is admissible. To this end, if $\phi^*$ is a strategy dominating $\phi$
for all $\omega$ in $\Omega$, then on account of Lemma 5, $\phi^* = \phi$ except possibly for $x = x_0$.
Let the value of $\phi^*_1(x_0) = \lambda^*$. Observe that for all $\omega$

$$0 \ge \rho(\phi^*, \omega) - \rho(\phi, \omega) = p(x_0 \mid \omega)(\lambda^* - \lambda)[L_1(\omega) - L_2(\omega)]\, \mu\{x_0\}.$$

Since $L_1 - L_2$ changes sign in the interior of $\Omega$ and our assumptions imply that
$p(x_0 \mid \omega) > 0$ except at the left endpoint of $\Omega$, $(\lambda^* - \lambda)\mu\{x_0\} = 0$, so that $\rho(\phi^*, \omega) = \rho(\phi, \omega)$
for all $\omega$. The admissibility of the special monotone strategies $\phi = \{1, 0\}$
and $\phi = \{0, 1\}$ is a trivial fact to establish. This completes the proof.

II. Admissibility for $n = 3$ actions. The objective here is to study the case of
3 actions. Such problems occur in many practical situations and are of interest.
For instance, the two-sided testing problem is of this form where the loss for the
alternative hypothesis depends on which side of the hypothesis the true parameter
value lies.

Throughout the remainder of this section we specialize the class of densities
to the exponential family, i.e., $p(x \mid \omega) = \beta(\omega) e^{x\omega}$.

The following further restrictions are placed upon $L_1$, $L_2$, and $L_3$.

ASSUMPTION E. It is required that $L_i$, $i = 1, 2, 3$, do not grow exponentially
and that they do not simultaneously approach zero exponentially in any region.

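LEMMA 6. If Assumption E is satisfied and $\Omega$ contains infinitely many $\omega$ values
tending either to $+$ or $-$ infinity (for definiteness let $\omega_i \to +\infty$ belong to $\Omega$), then
every monotone strategy involving all 3 actions is unique Bayes (except for at most
two values of $x$) against a three-point distribution $F$ for nature.

PROOF. Let $\omega_1$, $\omega_2$, and $\omega_3$ be chosen from the sets $S_1$, $S_2$, and $S_3$ (see condition C)
such that $(L_i - L_{i+1})(\omega_j) \ne 0$ for $i = 1, 2$ and $j = 1, 2, 3$. Consider
the following system of equations in the unknowns $\lambda_1$, $\lambda_2$, and $\lambda_3$ with $x_2 > x_1$
prescribed and $\omega_1$, $\omega_2$, and $\omega_3$ selected as above.

$$\begin{aligned} \lambda_1 (L_1 - L_2)(\omega_1) e^{\omega_1 x_1} + \lambda_2 (L_1 - L_2)(\omega_2) e^{\omega_2 x_1} + \lambda_3 (L_1 - L_2)(\omega_3) e^{\omega_3 x_1} &= 0, \\ \lambda_1 (L_2 - L_3)(\omega_1) e^{\omega_1 x_2} + \lambda_2 (L_2 - L_3)(\omega_2) e^{\omega_2 x_2} + \lambda_3 (L_2 - L_3)(\omega_3) e^{\omega_3 x_2} &= 0. \end{aligned} \tag{13}$$

The determinant

$$\begin{vmatrix} (L_1 - L_2)(\omega_1) e^{\omega_1 x_1} & (L_1 - L_2)(\omega_2) e^{\omega_2 x_1} & (L_1 - L_2)(\omega_3) e^{\omega_3 x_1} \\ (L_2 - L_3)(\omega_1) e^{\omega_1 x_2} & (L_2 - L_3)(\omega_2) e^{\omega_2 x_2} & (L_2 - L_3)(\omega_3) e^{\omega_3 x_2} \\ a_1 & a_2 & a_3 \end{vmatrix}$$

has the property that when the vector $a = (a_1, a_2, a_3)$ is equal to the first or
second row vector, then it vanishes. From this fact, we deduce that $\lambda_1$, $\lambda_2$,
$\lambda_3$ can be chosen proportional to the co-factors of the last row, respectively.

Noting that $(L_1 - L_2)(\omega_1) < 0$, $(L_1 - L_2)(\omega_2) > 0$, $(L_1 - L_2)(\omega_3) > 0$,
$(L_2 - L_3)(\omega_1) < 0$, $(L_2 - L_3)(\omega_2) < 0$, and $(L_2 - L_3)(\omega_3) > 0$, we readily find
that the co-factors of $a_1$ and $a_3$ are positive. The co-factor of $a_2$ is

$$\begin{aligned} &-\big[(L_1 - L_2)(\omega_1)(L_2 - L_3)(\omega_3) e^{\omega_1 x_1 + \omega_3 x_2} - (L_1 - L_2)(\omega_3)(L_2 - L_3)(\omega_1) e^{\omega_3 x_1 + \omega_1 x_2}\big] \\ &\quad = e^{\omega_3 x_1 + \omega_1 x_2} \big\{ (L_2 - L_1)(\omega_1)(L_2 - L_3)(\omega_3) e^{(\omega_3 - \omega_1)(x_2 - x_1)} - (L_1 - L_2)(\omega_3)(L_3 - L_2)(\omega_1) \big\}. \end{aligned}$$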

In view of Assumption E, if $\omega_3$ is chosen sufficiently large then this last expression
is positive and hence $\lambda_2 > 0$. Put $\mu_1 = k\lambda_1 / \beta(\omega_1)$, $\mu_2 = k\lambda_2 / \beta(\omega_2)$, and
$\mu_3 = k\lambda_3 / \beta(\omega_3)$, and normalize $\mu_i$ by suitable choice of $k$ so that $\sum \mu_i = 1$
with $\mu_i \ge 0$. Let $F$ be a distribution concentrating $\mu_i$ at $\omega_i$; then equations (13)
become

$$\int e^{\omega x_1} \beta(\omega)[L_1(\omega) - L_2(\omega)]\, dF(\omega) = 0, \qquad \int e^{\omega x_2} \beta(\omega)[L_2(\omega) - L_3(\omega)]\, dF(\omega) = 0.$$

Corollary 2 and Theorem 3 imply that

$$\int e^{\omega x} \beta(\omega)[L_1(\omega) - L_2(\omega)]\, dF(\omega) \;\begin{cases} > 0, & x > x_1, \\ < 0, & x < x_1, \end{cases}$$

$$\int e^{\omega x} \beta(\omega)[L_2(\omega) - L_3(\omega)]\, dF(\omega) \;\begin{cases} > 0, & x > x_2, \\ < 0, & x < x_2. \end{cases}$$

Consequently, the optimal procedure is to take action 1 when $x < x_1$, action 2
for $x_1 < x < x_2$, and action 3 for $x > x_2$. Indifference exists between actions
1 and 2 for $x = x_1$ and between actions 2 and 3 for $x = x_2$. Thus, if a monotone
strategy $\phi = \{\phi_1, \phi_2, \phi_3\}$ with

$$\phi_1 = \begin{cases} 1, & x < x_1, \\ \lambda_1, & x = x_1, \\ 0, & x > x_1, \end{cases} \qquad \phi_3 = \begin{cases} 0, & x < x_2, \\ \lambda_2, & x = x_2, \\ 1, & x > x_2, \end{cases}$$

is prescribed, then we have constructed a three-point distribution $F$ against
which the given $\phi$ is the unique Bayes strategy with the exception of two possible
values for the random observation $x$. The proof of the lemma is hereby complete.
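
The co-factor argument in the proof of Lemma 6 amounts to taking the cross product of the two coefficient rows of (13). The sketch below illustrates this with made-up numbers: the sign pattern of the loss differences follows the proof, but the particular values, the cutoffs, the parameter points and the function name `cofactor_weights` are all assumptions introduced here.

```python
import math

def cofactor_weights(x1, x2, omegas, d12, d23):
    """Solve system (13): lambda is the cross product of its two coefficient rows.

    d12[j] stands for (L1 - L2)(w_j) and d23[j] for (L2 - L3)(w_j).
    Returns the co-factors of the last row together with both rows.
    """
    r1 = [d * math.exp(w * x1) for d, w in zip(d12, omegas)]
    r2 = [d * math.exp(w * x2) for d, w in zip(d23, omegas)]
    lam = [
        r1[1] * r2[2] - r1[2] * r2[1],
        r1[2] * r2[0] - r1[0] * r2[2],
        r1[0] * r2[1] - r1[1] * r2[0],
    ]
    return lam, r1, r2

# Sign pattern from the proof: L1 - L2 is (-, +, +) and L2 - L3 is (-, -, +)
# at (w1, w2, w3); w3 is taken fairly large so the middle co-factor is positive.
lam, r1, r2 = cofactor_weights(
    x1=0.0, x2=1.0, omegas=(-1.0, 0.0, 3.0),
    d12=(-1.0, 1.0, 1.0), d23=(-1.0, -1.0, 1.0),
)
```

Dividing each co-factor by $\beta(\omega_j)$ and normalizing would then give the prior weights $\mu_j$; that step is omitted here because it depends on the particular exponential family.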

THEOREM 5. Under the assumptions of Lemma 6 every monotone strategy involving
all 3 actions is admissible.

PROOF. If $\phi^*$ is any strategy which dominates the given $\phi$, then by Lemma 6
$\phi = \phi^*$ except possibly at $x = x_1$ and $x_2$. Suppose $\mu\{x_1\}$ or $\mu\{x_2\}$ is positive
(otherwise, Theorem 5 is established). By (9)

$$0 \ge \rho(\phi^*, \omega) - \rho(\phi, \omega) = \beta(\omega)\big\{ e^{\omega x_1} (\lambda_1^* - \lambda_1)\mu\{x_1\}[L_1(\omega) - L_2(\omega)] + e^{\omega x_2} (\lambda_2^* - \lambda_2)\mu\{x_2\}[L_2(\omega) - L_3(\omega)] \big\}. \tag{14}$$

Letting $\omega \to +\infty$ and observing that the second term dominates, we conclude
that $(\lambda_2^* - \lambda_2)\mu\{x_2\} \le 0$. Examining $\omega$ in $S_2$, where $L_1 - L_2 > 0$ and $L_2 - L_3 < 0$,
(14) compels $(\lambda_1^* - \lambda_1)\mu\{x_1\} \le 0$. Finally, evaluating relation (14) for $\omega$ in $S_1$, using the
established facts that $(\lambda_2^* - \lambda_2)\mu\{x_2\} \le 0$ and $(\lambda_1^* - \lambda_1)\mu\{x_1\} \le 0$, yields that
$(\lambda_1^* - \lambda_1)\mu\{x_1\} = 0 = (\lambda_2^* - \lambda_2)\mu\{x_2\}$, whence $\rho(\phi^*, \omega) - \rho(\phi, \omega) = 0$ for all
$\omega$, and the proof is complete.

The above theorem does not treat the special monotone strategies which
involve only two possible actions. These strategies will now be shown to be
admissible. For instance, suppose $\phi$ represents a monotone strategy which
involves only actions 1 and 3. Precisely, let $\phi_1(x) = 1$ for $x < x_1$, 0 for $x > x_1$;
$\phi_3(x) = 1$ for $x > x_1$, 0 for $x < x_1$; and $\phi_2 \equiv 0$. Suppose $\phi^0$ is a monotone strategy
which dominates $\phi$ determined by the critical values $x_1^0$, $x_2^0$. Three cases will
now be considered.

Case 1. $x_1^0 < x_1 < x_2^0$. In view of (9)

$$0 \ge \rho(\phi^0, \omega) - \rho(\phi, \omega) = -[L_1(\omega) - L_2(\omega)]\beta(\omega) \int_{x_1^0}^{x_1} e^{\omega x}\, d\mu(x) + [L_2(\omega) - L_3(\omega)]\beta(\omega) \int_{x_1}^{x_2^0} e^{\omega x}\, d\mu(x).$$

Since the second term dominates as $\omega \to +\infty$, we arrive at a contradiction.

Case 2. $x_1 \le x_1^0 < x_2^0$. By (9)

$$0 \ge \rho(\phi^0, \omega) - \rho(\phi, \omega) = [L_1(\omega) - L_2(\omega)]\beta(\omega) \int_{x_1}^{x_1^0} e^{\omega x}\, d\mu(x) + [L_2(\omega) - L_3(\omega)]\beta(\omega) \int_{x_1}^{x_2^0} e^{\omega x}\, d\mu(x).$$

However, this inequality is impossible for $\omega$ in $S_3$.

Case 3. $x_1^0 < x_2^0 \le x_1$. This case can be handled similarly to that of case
2 above by examining $\omega$ in $S_1$.

The other types of monotone strategies involving at most two possible actions
are treated similarly. This argument can also be extended to yield the conclusion
of Theorem 5. The result of Lemma 6 is, however, stronger and possesses independent
interest.

We now produce an example to show that when the state space of nature $\omega$ is
restricted to a finite range, the conclusion of Theorem 5 is not valid. Let
$\phi = (\phi_1, \phi_2, \phi_3)$ be a monotone strategy given by $\phi_1(x) = 1$ on $x \le x_1$, 0 elsewhere;
$\phi_2(x) = 1$ on $x_1 < x \le x_2$, and zero elsewhere, with $\phi_3 = 1 - \phi_1 - \phi_2$.
We desire to construct a monotone strategy $\phi^0 = \{\phi_1^0, \phi_2^0, \phi_3^0\}$ which dominates $\phi$.
Let $\phi^0$ be determined by the critical values $x_1^0$ and $x_2^0$ with $x_1^0 < x_1$ and $x_2^0 > x_2$.
Consider, according to (9),


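$$\begin{aligned} \rho(\phi^0, \omega) - \rho(\phi, \omega) &= \beta(\omega) \Big\{ (L_1 - L_2)(\omega) \int e^{\omega x} [\phi_1^0(x) - \phi_1(x)]\, d\mu(x) + (L_2 - L_3)(\omega) \int e^{\omega x} [\phi_1^0 + \phi_2^0 - \phi_1 - \phi_2]\, d\mu(x) \Big\} \\ &= \beta(\omega) \Big\{ -(L_1 - L_2)(\omega) \int_{x_1^0}^{x_1} e^{\omega x}\, d\mu(x) + (L_2 - L_3)(\omega) \int_{x_2}^{x_2^0} e^{\omega x}\, d\mu(x) \Big\}. \end{aligned} \tag{15}$$

In the region $S_2$, (15) is automatically negative. In region $S_1$, we choose
loss functions so that

$$|(L_2 - L_3)(\omega)| > \frac{|(L_1 - L_2)(\omega)| \int_{x_1^0}^{x_1} e^{\omega x}\, d\mu(x)}{\int_{x_2}^{x_2^0} e^{\omega x}\, d\mu(x)}$$

and $L_3(\omega) > L_2(\omega)$. This can be done since $\omega$ ranges over a finite interval and
hence the integrals are continuous and bounded away from 0 and $\infty$. Similarly,
we determine in $S_3$ the loss functions $L_1$, $L_2$, and $L_3$ so that

$$|(L_1 - L_2)(\omega)| > \frac{|(L_2 - L_3)(\omega)| \int_{x_2}^{x_2^0} e^{\omega x}\, d\mu(x)}{\int_{x_1^0}^{x_1} e^{\omega x}\, d\mu(x)} \qquad \text{with } L_3(\omega) < L_2(\omega).$$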


With these determinations of the loss functions $L_1$, $L_2$, and $L_3$, we see that
(15) is always negative for the region of $\omega$ under consideration. Thus $\phi$ is not
admissible. Intuitively, for any monotone strategy defined by critical values
$x_1$ and $x_2$, loss functions can be chosen so that one wants to take action 2 more
often than prescribed by the given strategy. An example, where the natural
range of the distribution is a finite interval and where the above construction is
valid, is obtained by setting $d\mu(x) = e^{-|x|}\, dx$. In connection with Assumption E,
examples can be constructed to show that the growth restriction is essential in
order to ensure the validity of Theorem 5.

III. Admissibility for $n = 4$ actions. Our next task is to analyze the case of
four possible actions. Again Assumption E is to hold.

LEMMA 7. If the parameter space $\Omega$ contains arbitrarily large values of $\omega$ tending
to $+$ and $-$ infinity, then every monotone strategy involving all four actions is
unique Bayes, except for three possible values, against a distribution $F(\omega)$ involving
four points.

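PROOF. Let a monotone strategy $\phi$ be given, defined by the critical dividing
numbers $x_1$, $x_2$, and $x_3$ ($x_1 < x_2 < x_3$). Choose $\omega_1 < \omega_2 < \omega_3 < \omega_4$ from the
four regions $S_1$, $S_2$, $S_3$, and $S_4$. Consider the system of equations in the unknowns
$\lambda_1$, $\lambda_2$, $\lambda_3$, and $\lambda_4$ given by

$$\sum_{j=1}^{4} \lambda_j (L_i - L_{i+1})(\omega_j) e^{\omega_j x_i} = 0, \qquad i = 1, 2, 3. \tag{16}$$

As in the proof of Lemma 6, the solutions $\lambda_j$ are proportional to the co-factors
of the last row in the determinant

$$\begin{vmatrix} (L_1 - L_2)(\omega_1) e^{\omega_1 x_1} & (L_1 - L_2)(\omega_2) e^{\omega_2 x_1} & \cdots & (L_1 - L_2)(\omega_4) e^{\omega_4 x_1} \\ (L_2 - L_3)(\omega_1) e^{\omega_1 x_2} & (L_2 - L_3)(\omega_2) e^{\omega_2 x_2} & \cdots & (L_2 - L_3)(\omega_4) e^{\omega_4 x_2} \\ (L_3 - L_4)(\omega_1) e^{\omega_1 x_3} & (L_3 - L_4)(\omega_2) e^{\omega_2 x_3} & \cdots & (L_3 - L_4)(\omega_4) e^{\omega_4 x_3} \\ a_1 & a_2 & \cdots & a_4 \end{vmatrix}$$

Choosing $\omega_1$ sufficiently large negatively and $\omega_4$ sufficiently large positively,
the signs of the subdeterminants are determined by the signs of the principal
diagonals. It readily follows that by such a choice of $\omega_1$ and $\omega_4$ we get $\lambda_1 > 0$,
$\lambda_2 > 0$, $\lambda_3 > 0$, and $\lambda_4 > 0$. Define $\mu_i = k\lambda_i / \beta(\omega_i)$ with $\sum \mu_i = 1$ and $F$ a
distribution concentrating $\mu_i$ at $\omega_i$. It is easy to see by (16) that the given
strategy $\phi$ is unique Bayes against $F$ with the exception of three possible values
of $x$.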

THEOREM 6. Under the conditions of Lemma 7, every monotone strategy is
admissible.

The proof can be patterned, with some slight modification, after that of
Theorem 5, and the remarks following Theorem 5. We omit the details.

The assumption in Lemma 7 concerning the parameter values $\omega$ extending to
$\pm\infty$ is essential. If this condition is removed, an example can be produced
which permits $\omega$ to range to $+\infty$, and is bounded on the left such that suitable
monotone strategies are not admissible. The construction is similar to that
shown in connection with three actions where we make use of the exponential
for $\omega \to \infty$. This example will be left as an exercise for the interested reader.

In the case of 5 actions, even if the parameter range for w is the full infinite 
interval, loss functions can be defined so that not all monotone procedures 
are admissible. 

IV. Admissibility for $n$ actions. We now present some examples of general
interest for $n$ actions where the monotone strategies are all admissible. Let the
loss functions $L_i$ satisfy the conditions of A, B, C, and D and the further property

$$L_i(\omega) = L_{i+1}(\omega) \quad \text{for } \omega \notin S_i \text{ or } S_{i+1}. \tag{17}$$

Of course, by condition D, $L_i(\omega) < L_j(\omega)$ for $\omega \in S_i$. We shall now show that
every monotone strategy $\phi = (\phi_i)$ is unique Bayes except for $n - 1$ possible
values against an $n$-point distribution. Indeed, select $\omega_1 < \omega_2 < \omega_3 < \cdots < \omega_n$,
and $\omega_i$ in $S_i$. Note, $L_i(\omega_i) - L_{i+1}(\omega_i) < 0$ and $L_i(\omega_{i+1}) - L_{i+1}(\omega_{i+1}) > 0$.
Let the strategy $(\phi_i)$ be described by the critical values $x_i$ ($x_i \le x_{i+1}$). Consider
the system of linear equations in the unknowns $\lambda_j$:


$$\sum_{j=1}^{n} \lambda_j (L_i - L_{i+1})(\omega_j) e^{\omega_j x_i} = 0, \qquad i = 1, \dots, n - 1. \tag{18}$$

The solutions $\lambda_j$ are proportional to the co-factors of the last row of the determinant

$$\begin{vmatrix} (L_1 - L_2)(\omega_1) e^{\omega_1 x_1} & (L_1 - L_2)(\omega_2) e^{\omega_2 x_1} & 0 & \cdots \\ 0 & (L_2 - L_3)(\omega_2) e^{\omega_2 x_2} & (L_2 - L_3)(\omega_3) e^{\omega_3 x_2} & \cdots \\ 0 & 0 & (L_3 - L_4)(\omega_3) e^{\omega_3 x_3} & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{vmatrix}$$

The zeros appear in this determinantal expression as a consequence of (17). We
deduce easily that $\lambda_i$ are all of one sign. Put $\mu_i = k\lambda_i / \beta(\omega_i) > 0$ and $\sum \mu_i = 1$.
Define $F$ to concentrate weight $\mu_i$ at $\omega_i$. Equations (18), Corollary 2, and
Theorem 3 imply that $\phi$ is unique Bayes, except for $n - 1$ possible values of $x$,
against $F(\omega)$. Thus we have established
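
Because of (17), each equation of (18) involves only two consecutive unknowns, so the system can be solved by forward substitution instead of co-factors. The following sketch is one way to see that all solutions share a sign; the numerical inputs and the function name `chain_weights` are hypothetical.

```python
import math

def chain_weights(xs, omegas, d_own, d_next):
    """Solve the bidiagonal system (18) by forward substitution.

    d_own[i]  stands for (L_i - L_{i+1})(w_i)      (negative),
    d_next[i] stands for (L_i - L_{i+1})(w_{i+1})  (positive),
    xs[i]     is the critical value x_i, with 0-based i = 0, ..., n-2.
    The first weight is fixed at 1; only the ratios matter.
    """
    lam = [1.0]
    for i, x in enumerate(xs):
        a = d_own[i] * math.exp(omegas[i] * x)
        b = d_next[i] * math.exp(omegas[i + 1] * x)
        lam.append(-lam[-1] * a / b)  # from lam_i * a + lam_{i+1} * b = 0
    return lam

# Illustrative four-action inputs.
xs = [-1.0, 0.0, 1.5]
omegas = [-2.0, -0.5, 1.0, 2.5]
d_own = [-1.0, -2.0, -0.5]
d_next = [1.5, 1.0, 2.0]
lam = chain_weights(xs, omegas, d_own, d_next)
```

Each step multiplies the previous weight by a positive ratio, which is why no growth condition on the losses is needed in this case.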

THEOREM 7. If the loss functions $L_i$ satisfy the additional property (17), then
every monotone strategy $\phi = \{\phi_i\}$ is unique Bayes, except for $n - 1$ values of $x$,
against a distribution concentrating at $n$ points $\omega_i$. The parameter values $\omega_i$ can
be chosen arbitrarily provided only that $\omega_i$ is in $S_i$.

THEOREM 8. Under the conditions of Theorem 7, every monotone strategy is
admissible.

PROOF. If $\phi = \{\phi_i\}$ is a monotone procedure determined by the critical values
$x_i$ ($x_i \le x_{i+1}$), then by Theorem 7, $\phi$ is Bayes against a distribution concentrating
on $n$ values. If $\phi^*$ is a monotone procedure which dominates $\phi$, then by Theorem
7, $\phi = \phi^*$ except possibly for $x = x_i$. Let $\phi^*(x_i) = a_i^*$ and $\phi(x_i) = a_i$. Considering
$\omega$ in $S_1$ we find, since

$$0 \le \rho(\phi, \omega) - \rho(\phi^*, \omega) = \beta(\omega) e^{\omega x_1} (a_1 - a_1^*)[L_1(\omega) - L_2(\omega)]\, \mu\{x_1\},$$

that $(a_1 - a_1^*)\mu\{x_1\} \le 0$.
Examining the risks for $\omega$ in $S_2$, we get

$$0 \le \beta(\omega) \big\{ e^{\omega x_1} (a_1 - a_1^*)[L_1(\omega) - L_2(\omega)]\, \mu\{x_1\} + e^{\omega x_2} (a_2 - a_2^*)[L_2(\omega) - L_3(\omega)]\, \mu\{x_2\} \big\}$$

which implies as $(a_1 - a_1^*)\mu\{x_1\} \le 0$ that $(a_2 - a_2^*)\mu\{x_2\} \le 0$. Continued application
of this analysis by successively looking at $\omega$ in $S_i$ ($i = 1, \dots, n - 1$)
we conclude that $(a_i - a_i^*)\mu\{x_i\} \le 0$. Finally, for $\omega$ in $S_n$, we conclude that
$(a_{n-1} - a_{n-1}^*)\mu\{x_{n-1}\} = 0$ and working back we find successively that
$(a_i - a_i^*)\mu\{x_i\} = 0$. Consequently, $\rho(\phi, \omega) - \rho(\phi^*, \omega) = 0$ for all $\omega$, and the
proof of the theorem is complete.

An important application of Theorem 7 arises when we consider the situation
where $L_i(\omega) = a$ for $\omega \notin S_i$ and $L_i(\omega) = 0$ for $\omega \in S_i$. In other words, the statistician
is penalized a fixed amount if the wrong decision is made independently
of the action taken, with zero loss if the correct decision is made. Then the
conclusion of Theorem 8 can be stated as follows: The collection of all monotone
procedures forms a minimal essentially complete class.

This section is closed with the further enumeration of some examples of
minimal essentially complete classes. In an $n$ action case if $L_i(\omega) = |i - j|\, a$
for $\omega$ in $S_j$, then the collection of monotone procedures constitutes a minimal
essentially complete class. This is a situation where the penalty of a wrong
decision is proportional to how far the decision is from the correct action. The
proof of this last fact is omitted.


6. Minimax strategies for nature. In this section we characterize the form of
the minimax strategies for nature in the case of two actions. The underlying
distribution, as before, has the form

$$P(x \mid \omega) = \int_{-\infty}^{x} p(t \mid \omega)\, d\mu(t),$$

where $p$ satisfies the condition given in Section 1. We now require further that
$p(x \mid \omega)$ is continuous in $\omega$ for each fixed $x$. The loss functions $L_1$ and $L_2$ have
the properties A through D (see Section 1) and are in addition continuous.
Without loss of generality, they may take the form $L_1(\omega) = 0$ for $\omega \le \theta$, $L_1(\omega) > 0$,
$\omega > \theta$, while $L_2(\omega) > 0$ for $\omega < \theta$, $L_2(\omega) = 0$, $\omega \ge \theta$, where $\theta$ is interior to $\Omega$.
Let $\rho(\phi, F)$ denote the expected risk when nature chooses a distribution $F$, and
the statistician follows the procedure $\phi$. Lebesgue's convergence criterion, in
view of the continuity assumptions on $L_i$, implies that $\rho(\phi, \omega)$ is continuous
in $\omega$. Let the smallest interval containing $\Omega$ be denoted by $[a, b]$. The points
$a$, $b$ may or may not belong to $\Omega$ and the values $\pm\infty$ are not excluded. If $a$
does not belong to $\Omega$, then we assume that for any monotone strategy $\phi$ for
which $\phi \not\equiv 0$,

$$\lim_{\omega \to a} L_2(\omega) \int [1 - \phi(t)]\, p(t \mid \omega)\, d\mu(t) = 0. \tag{19}$$

Similarly, we assume that if $b$ does not belong to $\Omega$, then for any monotone
strategy $\phi$ with $\phi \not\equiv 1$,

$$\lim_{\omega \to b} L_1(\omega) \int \phi(t)\, p(t \mid \omega)\, d\mu(t) = 0. \tag{20}$$

If $a$ belongs to $\Omega$, then we do not impose condition (19). A similar statement
applies to the endpoint $b$.

If $L_1$ and $L_2$ do not grow exponentially and the family of distributions belongs
to the exponential class, then it is easy to verify the validity of conditions (19)
and (20).

Consider the game $G_n$ defined as follows: Let $\omega$ range over the interval $[\omega_n, \omega_n']$
where $\omega_n < \theta < \omega_n'$. If $a$ belongs to $\Omega$, we take $\omega_n = a$ and if $b$ belongs to $\Omega$
take $\omega_n' = b$. Otherwise let $\omega_n \to a$ and $\omega_n' \to b$. Let the strategy space for the
statistician consist of all monotone strategies and let the strategy space for
nature $\mathfrak{F}_n$ consist of all distributions on the closed interval $[\omega_n, \omega_n']$. The payoff
is the risk $\rho(\phi, F)$ which is continuous in $\phi$ and in $F$ in the usual weak topology
imposed on these sets of strategies. The reason we can restrict ourselves to the
monotone procedures in the consideration of the game $G_n$ is a consequence of
Theorem 1. The following facts will be used in the course of the subsequent
analysis: Suppose $\phi_r(x) \to 1$ for every $x$, where $\phi_r(x) = 1$, $x < x_r$, $= 0$, $x > x_r$.
For any set of $\omega$ where $\omega \in \Omega \cap [\bar\omega, \bar\omega^*] = W$ with both $\bar\omega$ and $\bar\omega^*$ interior to $\Omega$, it
follows that

$$\int [1 - \phi_r(x)]\, p(x \mid \omega)\, d\mu(x) \tag{21a}$$

converges uniformly to zero in $W$ as $r \to \infty$. Similarly, if $\phi_r(x) \to 0$ where $\phi_r(x) = 0$,
$x < x_r$, $= 1$, $x > x_r$, then for any set of $\omega$ of the form $W$, it follows that

$$\int \phi_r(x)\, p(x \mid \omega)\, d\mu(x) \tag{21b}$$

converges uniformly to zero.

That we obtain pointwise convergence in (21a) and (21b) follows from
Lebesgue's convergence criterion. The uniformity of the convergence can be
secured with the aid of the Corollary to Lemma C in Section 1.

Since the spaces of strategies are both compact, it is a well-known result
that the game is determined. Let $\phi_0^n$ and $F_0^n$ denote minimax strategies for the
statistician and nature respectively. If $v_n$ denotes the value of the game, then

$$\rho(\phi_0^n, F) \le v_n, \quad \text{all } F \text{ in } \mathfrak{F}_n; \qquad \rho(\phi, F_0^n) \ge v_n, \quad \text{all monotone } \phi. \tag{22}$$

Let $T_n$ denote the set of all $\omega$ satisfying $\rho(\phi_0^n, \omega) = v_n$. The set $T_n$ cannot be
fully contained in either of the intervals $I_n^1 = [\omega_n, \theta]$ or $I_n^2 = (\theta, \omega_n']$. Indeed,
if $T_n \subset (\theta, \omega_n']$, then since the spectrum of $F_0^n$ must lie in $T_n$, by examining
$\phi = (\phi_1, 1 - \phi_1)$, where $\phi_1 \equiv 0$, we secure, using the fact that $L_2(\omega) = 0$ for $\omega \ge \theta$,
that $\rho(\phi, F_0^n) = 0$, an impossibility. Hence, $T_n$ contains points of both intervals
$I_n^1$ and $I_n^2$. This last analysis also implies that the monotone procedure $\phi_0^n$ is
not identically 1 or 0. Choose $\omega_1^n$ in $I_n^1 \cap T_n$ and $\omega_2^n$ in $I_n^2 \cap T_n$. In view of Lemma
5, there exists a distribution $F^n(\omega)$ with spectrum consisting of $\omega_1^n$ and $\omega_2^n$ such
that $\phi_0^n$ is Bayes against $F^n$. Since $\omega_1^n$ and $\omega_2^n$ belong to $T_n$ we get that
$v_n = \rho(\phi_0^n, F^n) = \min_\phi \rho(\phi, F^n)$. Hence $F^n$ is a minimax strategy for nature involving
only two points. Allow $n$ to go to infinity and select a limit monotone strategy
$\phi_1 = \lim_{i \to \infty} \phi_0^{n_i}$. It will now be shown that $\phi = (\phi_1, 1 - \phi_1)$ cannot have
$\phi_1 \equiv 0$ or $\phi_1 \equiv 1$. First note that by choosing any two-point strategy $F(\omega)$
for nature, it follows that $v_n \ge \alpha > 0$. Consider the case where $\phi_1 \equiv 1$. If $x_{n_i}$
represents the critical dividing point for the strategy $\phi_0^{n_i}$, then $x_{n_i}$ must converge
to the right-hand endpoint of $X$. But,

$$\rho(\phi_0^{n_i}, \omega) = L_2(\omega) \int (1 - \phi_0^{n_i})\, p(x \mid \omega)\, d\mu(x)$$

tends uniformly to zero for $\omega < \theta$. This is a consequence of assumption (19)
and equation (21a). As $v_{n_i} \ge \alpha > 0$, we deduce for $n_i$ sufficiently large that
$T_{n_i}$ must lie wholly in the interval $I_{n_i}^2$, contradicting the fact established above
that $T_{n_i}$ intersects both $I_{n_i}^1$ and $I_{n_i}^2$. A similar argument using (20) and (21b)
eliminates the possibility that $\phi_1 \equiv 0$. This completes the proof of the assertion
made.

In view of conditions (19) and (20), we find easily that $\rho(\phi^n, \omega) \to 0$ as $\omega \to a$
and $b$, if $a$ and $b$ do not belong to $\Omega$. It is clear that there exist infinitely many
$n$, which we enumerate through $m$, such that $\phi_1^m \le \phi_1$ or $\phi_1^m \ge \phi_1$. Without
loss of generality, let us consider the case where $\phi_1^m \le \phi_1$. It follows now that
for all $m > m_0$ there exists a subinterval $U = [\omega', \omega'']$ depending on $\epsilon < \alpha$ with
$\omega'$, $\omega''$ interior to $\Omega$ such that for $\omega \notin U$ and $m > m_0$, $\rho(\phi^m, \omega) \le \epsilon < \alpha$. Consequently,
all the two-point distributions $F^m$ constructed above have the property
that their spectrum is simultaneously contained in $U$. A limit two-point distribution
$\tilde F$ can now be selected with spectrum in $U$ as $U$ is compact. By considering
an appropriate subsequence $n = l_k$, it follows that $\lim_{k \to \infty} v_{l_k} = v$, $\rho(\tilde\phi, F) \le v$
for every $F$, and $\rho(\phi, \tilde F) \ge v$ for any monotone strategy $\phi$, where $\tilde\phi$ denotes the
limit strategy selected above. These last inequalities
can be expressed in the following theorem.

THEOREM 9. If the loss functions and distributions satisfy (19) and (20), then the game with risk function $\rho(\phi, F)$ as $\phi$ ranges over all strategies and $F$ ranges over all distributions on $\Omega$ is determined, i.e.,

(23) $\min_\phi \max_F \rho(\phi, F) = \max_F \min_\phi \rho(\phi, F)$.

Moreover, there exists a minimax strategy $F$ for nature involving only two points of increase.

A careful study of the proof actually shows

COROLLARY. If (23) holds and there exists a minimax strategy for nature, then there exists a minimax strategy for nature with a two-point spectrum.

The conditions (19) and (20) were imposed to ensure the determinateness of the game $\rho(\phi, F)$ as given in (23).

Another type of result concerns Bayes distributions for nature against a given monotone procedure. We limit ourselves for $P(x \mid \omega)$ to the exponential class and we assume that the $L_i(\omega)$ are such that (19) and (20) hold. For simplicity
put 6 0. 

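The yield, if the statistician employs the procedure $\phi_1(x) = 1$ for $x < x_0$, $\phi_1(x) = 0$ for $x > x_0$, becomes

$$H(\omega) = \begin{cases} \beta(\omega) L_1(\omega) \int_{-\infty}^{x_0} e^{\omega x}\, d\mu(x), & \omega > 0, \\ \beta(\omega) L_2(\omega) \int_{x_0}^{\infty} e^{\omega x}\, d\mu(x), & \omega < 0. \end{cases}$$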


For definiteness, let us assume that the endpoints are $\pm\infty$ when open at that respective end. By (19) and (20), $H(\omega) \to 0$ as $\omega \to \infty$ or $\omega \to -\infty$. Also $H(0) = 0$. If the $L_i$ were analytic, then $H$ could achieve its maximum at most a finite number of times, and since a Bayes distribution places probability one on the maximum values, every Bayes distribution against a monotone procedure involves only a finite number of points. Examples can be produced which show that the number of points involved in such a distribution may be more than 2. However, for certain distributions and suitable loss functions we can show that every Bayes distribution involves at most two points.

EXAMPLE 1. Suppose that $p(x \mid \omega) = (1/\sqrt{2\pi})\, e^{-\frac{1}{2}(x - \omega)^2}$ and that, for $\omega > 0$, $L_1(\omega)$ satisfies the condition that $L_1(\omega)/L_1'(\omega)$ is nondecreasing and that $L_2(\omega)/L_2'(\omega)$ is nondecreasing in $\omega < 0$. We now show that

(24) $L_1(\omega)\,\beta(\omega) \int_{-\infty}^{x_0} e^{\omega x}\, d\mu(x) = L_1(\omega)\, \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x_0} e^{-\frac{1}{2}(x - \omega)^2}\, dx = L_1(\omega)\,\Phi(x_0 - \omega)$

has a unique maximum for $\omega \ge 0$, where $\Phi$ is the cumulative standard normal distribution. Differentiating (24) yields

(25) $-L_1(\omega)\,\Phi'(x_0 - \omega) + L_1'(\omega)\,\Phi(x_0 - \omega)$.

Dividing by $L_1'(\omega)$ and $\Phi'(x_0 - \omega)$, we have

$$-\frac{L_1(\omega)}{L_1'(\omega)} + \frac{\Phi(x_0 - \omega)}{\Phi'(x_0 - \omega)}.$$

It is an easy matter to show for $\omega > 0$ that the second term is decreasing, and thus (25) has at most one zero. Hence for $\omega > 0$ (24) has a unique maximum. A similar analysis shows that

$$L_2(\omega)\,\beta(\omega) \int_{x_0}^{\infty} e^{\omega x}\, d\mu(x)$$


has a unique maximum for w < 0. This implies that for any monotone strategy 
for the statistician the Bayes strategy for nature concentrates at most at two 
points. 

EXAMPLE 2. Imposing the same assumptions on the loss functions, we now give a general sufficient condition on the distribution to ensure a unique maximum in each of the regions $\omega \ge 0$ and $\omega < 0$. Let

$$\beta_{x_0}(\omega) = \int_{-\infty}^{x_0} e^{\omega x}\, d\mu(x).$$

By (23)

$$H(\omega) = \beta(\omega)\,\beta_{x_0}(\omega)\,L_1(\omega) \quad \text{for } \omega > 0.$$

Let $m(\omega) = \beta(\omega)\,\beta_{x_0}(\omega)$. We deduce by reasoning analogous to that of Example 1 that if $-m(\omega)/m'(\omega)$ is strictly monotone decreasing for $\omega > 0$, then $H$ has a unique maximum, as $H(0) = 0$ and we make assumptions on $L_1$ so that $H(\omega) \to 0$ as $\omega \to \infty$. However, $m(\omega)/m'(\omega)$ is increasing if and only if $\log m$ is concave. But

$$\log m(\omega) = \log \beta(\omega) + \log \beta_{x_0}(\omega).$$

Let $y$ denote the random variable with density $\beta(\omega) e^{\omega x}$ with respect to $\mu$, and let $y_{x_0}$ denote the random variable with the same density except that we alter $\mu$ so that $\mu(x_0, \infty) = 0$. In other words, $y_{x_0}$ is the truncated random variable where we do not allow observations $x > x_0$. The variances of these random variables will be denoted by

$$\sigma_\omega^2(y) \quad \text{and} \quad \sigma_\omega^2(y_{x_0})$$

when the true parameter value is $\omega$. Since $(d^2/d\omega^2) \log \beta(\omega) = -\sigma_\omega^2(y)$, we obtain that $\log m$ has a negative second derivative if and only if

(26) $\sigma_\omega^2(y) > \sigma_\omega^2(y_{x_0})$.

Thus if (26) holds for every $x_0$, then $H(\omega)$ has a unique maximum. In order for $H$ in (23) to possess a unique maximum for $\omega < 0$, a condition corresponding to (26) must be satisfied when the random variable is truncated on the other side, namely, when we take $\mu(-\infty, x_0) = 0$. Thus a sufficient condition for at most two maxima for $H$ is that (26) hold for every truncated variable on the left and right. Some instances where (26) holds for the exponential class are the normal, gamma, Poisson, and binomial. However, examples of exponential distributions can be constructed in which the seemingly natural condition (26) is not satisfied. We leave to the reader the task of constructing such examples.


7. Essentially complete classes for infinite number of actions. An analysis of essentially complete classes of decision procedures for distributions with monotone likelihood ratio for an infinite number of actions will now be carried out. The parameter $\omega$ for nature will range over an interval $\Omega$ as before. The action space $A$ will also consist of a closed subset of the real line. The loss function $L(\omega, a)$ ($\omega$ in $\Omega$, $a$ in $A$) is assumed to satisfy the following properties:

(i) For each w, L(w, a) attains its minimum as a function of a at a point 
a = q(w) which is a monotone increasing function of w. 

(ii) For each w, L(w, a) as a function of a increases away from that minimum. 

Without loss of generality we may take $L(\omega, q(\omega)) = 0$ for every $\omega$.

Particularly important examples of decision problems whose loss functions satisfy (i) and (ii) are furnished by the estimation problem. Here a commonly used loss function is given by $L(\omega, a) = |\omega - a|^k$ $(k > 0)$, where both $\omega$ and $a$ traverse the infinite real line. The function $q(\omega)$ is evidently equal to $\omega$.

A decision procedure or strategy for the statistician in this case is a probability measure $\nu(a \mid x)$ on $A$ specified for every observation $x$. A monotone strategy is defined in this general situation as follows: If $x_1 > x_2$ and $C_1$ and $C_2$ are open sets with $C_1$ lying to the left of $C_2$, then either

$$\nu(C_1 \mid x_1) = 0 \quad \text{or} \quad \nu(C_2 \mid x_2) = 0.$$

This definition agrees with the meaning of monotone strategy for a finite number of actions given previously. In the case of convex loss functions (e.g., the estimation problem introduced above) where nonrandomized strategies are frequently employed, we obtain that a monotone decision procedure can be identified with a function $\phi$ with values in $A$ such that $\phi$ is monotone nondecreasing.

The risk for $\omega$ and a given strategy $\nu$ has the form

$$\rho(\omega, \nu) = \iint L(\omega, a)\, d\nu(a \mid x)\, p(x \mid \omega)\, d\mu(x).$$

Our main objective in this section is to establish the essential completeness 
of the monotone strategies for the case where A is infinite and closed. The proof 
will be carried out in three stages: First, the theorem will be demonstrated for 
the case that L(w, a) is continuous and bounded in a for each fixed w; second, 
when L(w, a) is 0 or 1, and then the general case. 

The method of proof for the case where L(w, a) is continuous in a entails 
a limiting argument by using the essential completeness theorem for a finite 
number of actions. 

Let $\mathcal{F}$ be the collection of all non-null finite subsets of $A$ partially ordered by set inclusion, i.e., $B \ge C$ if and only if $B \supseteq C$, where $B$ and $C$ are members of $\mathcal{F}$. Let $\mathcal{H}$ be any subfamily with the properties:

(a) If $B \in \mathcal{H}$ and $C \in \mathcal{H}$, there is a $D$ in $\mathcal{H}$ such that $B \cup C \subseteq D$.

(b) $\bigcup_{B \in \mathcal{H}} B$ is dense in $A$.

The family of finite sets $\mathcal{H}$, in view of (a) and (b), forms a directed system and therefore we can speak of convergence with respect to $\mathcal{H}$.

We shall construct for every $B$ in $\mathcal{H}$ a new loss function $L_B(\omega, b)$, preserving the monotone properties of assumptions A through C for the case of a finite number of actions. For a given decision procedure $\nu$, we shall then construct a new decision procedure $\nu_B$ concentrated on $B$. This will have the property that $\nu_B$ converges to $\nu$ as $B$ gets large and that, if $L$ is bounded and continuous in $a$ for each $\omega$, $\rho(\omega, \nu_B)$ converges to $\rho(\omega, \nu)$. With the aid of Theorem 1 we can then produce a monotone strategy $\nu_B^*$ concentrating on $B$ better than $\nu_B$, and if we take $\nu^*$ as a cluster point of $\nu_B^*$, $\rho(\omega, \nu_B^*)$ will converge to $\rho(\omega, \nu^*)$ with $\rho(\omega, \nu^*) \le \rho(\omega, \nu)$ and $\nu^*$ a monotone procedure. The conclusions will then be extended to the case of loss functions $L(\omega, a)$ not necessarily continuous in $a$. The above discussion indicates the direction of the proof; we now proceed to develop the details.

For each $a$ in $A$, define $I_B(a)$ to be the smallest closed interval whose endpoints are in $B$ and which contains $a$ (the endpoints may coincide), if there is such an interval. If there is no such interval, let $I_B(a)$ be the set consisting of the nearest element of $B$.

For $a$ in $A$ and $b$ in $B$, put

$$f_B(a, b) = \begin{cases} 0 & \text{if } b \notin I_B(a), \\ \lambda & \text{if } I_B(a) = [b, c] \text{ or } [c, b] \text{ and } a = \lambda b + (1 - \lambda)c. \end{cases}$$

The function $f_B(a, b)$ will be used to distribute the probability of $a$ over $B$. In fact, set

(28) $\nu_B(S \mid x) = \sum_{b \in S} \int f_B(a, b)\, d\nu(a \mid x)$.

The probability of $\nu(a \mid x)$ concentrated at $a$ is distributed to the points $b$ and $c$ of $B$, where $a \in [b, c]$, with weights $\lambda$ and $1 - \lambda$, so that the nearer point receives the larger share. In forming $\nu_B(S \mid x)$ this is done for every $a$ in $A$.
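The redistribution step (28) can be sketched in code for a decision rule with finitely many atoms. This is a minimal illustration under stated assumptions: the measure $\nu(\cdot \mid x)$ for one fixed $x$ is represented as a dictionary from actions to probabilities, $B$ is a finite list of reals, and the function names are hypothetical.

```python
from bisect import bisect_left


def bracket(a, grid):
    """Endpoints (b, c) of I_B(a) for a sorted grid; b == c when a is in B
    or lies outside the range of B (nearest element)."""
    i = bisect_left(grid, a)
    if i < len(grid) and grid[i] == a:
        return a, a
    if i == 0:
        return grid[0], grid[0]
    if i == len(grid):
        return grid[-1], grid[-1]
    return grid[i - 1], grid[i]


def discretize(nu, grid):
    """Move each atom of nu onto the grid using the weights f_B(a, b)."""
    grid = sorted(grid)
    out = {b: 0.0 for b in grid}
    for a, prob in nu.items():
        b, c = bracket(a, grid)
        if b == c:
            out[b] += prob
        else:
            lam = (c - a) / (c - b)  # solves a = lam * b + (1 - lam) * c
            out[b] += lam * prob
            out[c] += (1.0 - lam) * prob
    return out


if __name__ == "__main__":
    print(discretize({0.25: 0.5, 2.0: 0.5}, [0.0, 1.0]))
```

In this example the atom at 0.25 is split 3:1 between the grid points 0 and 1, and the atom at 2.0, which lies outside the grid, goes entirely to the nearest point 1. Total probability is preserved.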

The loss function is now altered as follows: Set

(29) $L_B(\omega, b) = \begin{cases} L(\omega, b) & \text{if } b \notin I_B(q(\omega)), \\ 0 & \text{if } b \in I_B(q(\omega)). \end{cases}$

The function $L_B(\omega, b)$ differs from $L(\omega, b)$ only in the neighborhood of $q(\omega)$. Moreover, $L(\omega, b)$ is changed for at most two adjacent values of $b$, and it is readily seen that for each fixed $\omega$, $L_B(\omega, b)$ converges uniformly to $L(\omega, b)$ on $B$.

It is readily verified that the loss function $L_B(\omega, b)$ satisfies the conditions of (A) through (D). By Theorem 1 there exists a monotone procedure $\nu_B^*$ concentrating on $B$ such that for the loss function $L_B(\omega, b)$

$$\rho_B(\omega, \nu_B^*) \le \rho_B(\omega, \nu_B).$$

It is important to note that the constructed monotone strategy $\nu_B^*$ depends only on $\nu_B$ and on $q$ and not on the nature of the loss functions elsewhere. This was pointed out in the remark following Theorem 1, for the change points used there can be made to depend only on $q$.

Since the space of measures is compact in the weak* topology we can select a measure $\nu^*(a \mid x)$ which is a simultaneous cluster point of $\nu_B^*(a \mid x)$ for every $x$. In view of the continuity of $L(\omega, a)$ as a function of $a$, we get for every $\omega$

(30) $\rho(\omega, \nu_B^*) = \iint L(\omega, a)\, d\nu_B^*(a \mid x)\, p(x \mid \omega)\, d\mu(x) \to \iint L(\omega, a)\, d\nu^*(a \mid x)\, p(x \mid \omega)\, d\mu(x) = \rho(\omega, \nu^*)$.

As $L_B(\omega, b)$ converges uniformly to $L(\omega, b)$, we also find that

(31) $\rho_B(\omega, \nu_B^*) - \rho(\omega, \nu_B^*) = \iint L_B(\omega, a)\, d\nu_B^*(a \mid x)\, p(x \mid \omega)\, d\mu(x) - \iint L(\omega, a)\, d\nu_B^*(a \mid x)\, p(x \mid \omega)\, d\mu(x)$

can be made as small as we desire for each fixed $\omega$ by choosing $B$ sufficiently large. Our next task is to show that $\rho(\omega, \nu_B) \to \rho(\omega, \nu)$. To this end, define for fixed $\omega$

(32) $\bar\nu(S) = \int \nu(S \mid x)\, p(x \mid \omega)\, d\mu(x)$.

The set function $\bar\nu$ is a probability distribution on $A$ induced by $\nu$ if $\omega$ is the state of nature. Then clearly

(33) $\rho(\omega, \nu) = \int L(\omega, a)\, d\bar\nu(a)$.





296 SAMUEL KARLIN AND HERMAN RUBIN 


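We also see from the construction of $\nu_B$ that for each $b$ in $B$

(34) $\nu_B((-\infty, b) \mid x) \le \nu((-\infty, b) \mid x), \qquad \nu_B((-\infty, b] \mid x) \ge \nu((-\infty, b] \mid x)$.

The same inequalities persist for $\bar\nu_B$ and $\bar\nu$. Hence for any $c$ in $\bigcup_{B \in \mathcal{H}} B$, we have

(35) $\limsup_B \bar\nu_B((-\infty, c)) \le \bar\nu((-\infty, c)), \qquad \liminf_B \bar\nu_B((-\infty, c]) \ge \bar\nu((-\infty, c])$,

and hence $\bar\nu_B$ converges to $\bar\nu$ in the sense of measures. As $L(\omega, a)$ is continuous in $a$, we now conclude from (35) that $\rho(\omega, \nu_B) \to \rho(\omega, \nu)$. Combining (30), (31), (35), and the fact that $\rho_B(\omega, \nu_B^*) \le \rho_B(\omega, \nu_B)$ yields finally that

(36) $\rho(\omega, \nu^*) \le \rho(\omega, \nu)$.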


Again, we emphasize the fact that $\nu^*$ depends only on $\nu$ and $q$ and in no other way on the nature of the loss functions. We next show that $\nu^*$ is monotone and completely additive.

(a) Proof that $\nu^*$ is monotone: Let $x_1 < x_2$, and let $C_1$ and $C_2$ be two open sets where $C_1$ lies to the right of $C_2$. For any given $\epsilon > 0$ there exists a $B$ such that

$$\nu_B^*(C_i \mid x_i) > \nu^*(C_i \mid x_i) - \epsilon, \qquad i = 1, 2.$$

Since for every $B$ either $\nu_B^*(C_1 \mid x_1)$ or $\nu_B^*(C_2 \mid x_2)$ is 0, it follows that $\nu^*(C_1 \mid x_1)$ or $\nu^*(C_2 \mid x_2)$ is 0.

(b) Proof that $\nu^*$ is completely additive for almost all $x$ for each $\omega$. Consider

$$K(\nu^*, x) = 1 - \lim_{N \to \infty} \nu^*([-N, N] \mid x).$$

Let $L_n(\omega, a)$ be a sequence of new loss functions with the properties:

(1) $L_n(\omega, a)$ increases to $L_\infty(\omega, a)$,

(2) $\lim_{a \to \pm\infty} L_n(\omega, a) = n$,

(3) $\int L_\infty(\omega, a)\, d\bar\nu(a) < \infty$,

(4) $L_n(\theta, a) = L(\theta, a)$ for $\theta \ne \omega$.

For any completely additive measure there always exists a function increasing to infinity at the endpoints such that (3) holds. The $L_n(\omega, a)$ are so constructed that each remains bounded but they converge to $L_\infty(\omega, a)$. The loss functions $L(\theta, a)$ are not altered for $\theta \ne \omega$. On the other hand, only $L(\omega, a)$ is replaced by a sequence of $L_n(\omega, a)$ tending to infinity at the endpoints, preserving $q$. Since $\nu^*$ depended only on $\nu$ and $q$, we obtain

$$\rho_n(\omega, \nu) \ge \rho_n(\omega, \nu^*) = \iint L_n(\omega, a)\, d\nu^*(a \mid x)\, p(x \mid \omega)\, d\mu(x) \ge (n - \epsilon) \int K(\nu^*, x)\, p(x \mid \omega)\, d\mu(x).$$

In view of (3) and (1), we find that $\rho_\infty(\omega, \nu) < \infty$ and therefore

$$\int K(\nu^*, x)\, p(x \mid \omega)\, d\mu(x) = 0.$$

Consequently, $K(\nu^*, x) = 0$ for almost all $x$ for each $\omega$. But this is equivalent to the countable additivity.
Combining all the previous conclusions leads to

THEOREM 9. Given any decision procedure $\nu$, there is a monotone decision procedure $\nu^*$ depending only on $\nu$ and $q$ such that if $L(\omega, a)$ is a bounded continuous function of $a$ for each fixed $\omega$ satisfying (i) and (ii) with the prescribed $q$, then

$$\rho(\omega, \nu^*) \le \rho(\omega, \nu)$$

for all $\omega$.

From the easily seen fact that every function which is 0 on $I_1$ and 1 on $I_2 \cup I_3$, where $I_1$, $I_2$, and $I_3$ are disjoint intervals covering $(-\infty, \infty)$, can be approximated by a sequence of continuous functions which are all 0 at a specified point of $I_1$, monotone away from that point, and bounded by 1, and from the Lebesgue convergence theorem, it follows that the $\nu^*$ whose existence was established in Theorem 9 also works for all loss functions which only assume the values 0 and 1 on intervals of the form $I_1$, $I_2$, and $I_3$ specified above.

Now if $\lambda > 0$, we define

$$L^\lambda(\omega, a) = \begin{cases} 0 & \text{if } L(\omega, a) < \lambda, \\ 1 & \text{if } L(\omega, a) \ge \lambda. \end{cases}$$

On account of property (ii) for $L(\omega, a)$, we see that $L^\lambda(\omega, a)$ is 0 on an interval $I_1$ and 1 on two intervals $I_2$ and $I_3$, all disjoint, which together cover $(-\infty, \infty)$. Thus

$$\rho^\lambda(\omega, \nu^*) \le \rho^\lambda(\omega, \nu).$$

But

$$L(\omega, a) = \int_0^\infty L^\lambda(\omega, a)\, d\lambda,$$

and since the order of integration can be reversed, we have

$$\rho(\omega, \nu^*) \le \rho(\omega, \nu).$$

Thus we have established

THEOREM 10. Given any decision procedure $\nu$, there is a monotone decision procedure $\nu^*$ such that for all monotone $L(\omega, a)$ satisfying (i) and (ii) with a given $q$, $\nu^*$ dominates $\nu$.’


? This theorem was proved by a more complicated method in [5]. 
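The reduction to 0-1 losses in the proof of Theorem 10 rests on the identity $L = \int_0^\infty L^\lambda\, d\lambda$, where $L^\lambda$ is the indicator of $L \ge \lambda$. A small numerical sketch of that identity follows; the midpoint Riemann sum, the truncation point `upper`, and the function names are illustrative assumptions, not part of the paper.

```python
def indicator_loss(loss_value, lam):
    """L^lambda: 1 when the loss is at least lam, 0 otherwise."""
    return 1.0 if loss_value >= lam else 0.0


def layer_cake_integral(loss_value, upper, steps=10000):
    """Midpoint Riemann sum of L^lambda over lambda in (0, upper).

    Requires upper >= loss_value, since the integrand vanishes beyond the loss.
    """
    width = upper / steps
    total = sum(indicator_loss(loss_value, (i + 0.5) * width) for i in range(steps))
    return total * width


if __name__ == "__main__":
    print(layer_cake_integral(2.5, upper=10.0))
```

Because each $L^\lambda$ is a 0-1 loss of the kind already handled, integrating the inequality $\rho^\lambda(\omega, \nu^*) \le \rho^\lambda(\omega, \nu)$ over $\lambda$ gives the inequality for $L$ itself.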
8. Bayes strategies for the case of an infinite number of actions. In addition to the conditions imposed on $L(\omega, a)$ in the previous section, we assume that for any two actions $a_1 < a_2$, $L(\omega, a_1) - L(\omega, a_2)$ changes sign at most once in the direction of negative to positive values and has at most one zero. For example, this condition is satisfied when $L(\omega, a) = |\omega - a|^k$ $(k > 0)$. The conditions (i) and (ii) almost imply this requirement. Furthermore, we require here that a. = X and the inequality in (2) be strict.

By Lemma 1, for any two actions $a < a'$,

(37) $\rho_a(x) - \rho_{a'}(x) = \int [L(\omega, a) - L(\omega, a')]\, p(x \mid \omega)\, dF(\omega)$

changes sign at most once in the direction of negative to positive values. Hence, if, for a given $x_0$, $\min_a \rho_a(x_0)$ is achieved for a set $A(x_0)$ with g.l.b. equal to $a_0$, then for $a < a_0$ and $x > x_0$, by virtue of (37) we have

$$\rho_{a_0}(x) < \rho_a(x).$$

Thus the minimum of $\rho_a(x)$ for $x > x_0$ is attained for a set of values of $a$ with $a \ge a_0$. Since any Bayes strategy must for $x$ concentrate its full measure in the set $A(x)$, we deduce from this last fact that the Bayes strategy must be a monotone procedure. Thus we have shown

THEOREM 11. If $L(\omega, a)$ satisfies (i) and (ii) of Section 7 and if $L(\omega, a_1) - L(\omega, a_2)$ changes sign at most once for $a_1 < a_2$ and has at most one zero, then any Bayes strategy against a distribution $F$ is a monotone procedure.
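Theorem 11 can be illustrated on a toy problem. The sketch below assumes a unit-variance normal likelihood, a three-point prior, a finite action grid, and the loss $|\omega - a|^k$; all of these choices and the function names are hypothetical. It computes the action minimizing the quantity $\rho_a(x)$ of (37) for several observations and checks that the chosen action does not decrease as $x$ increases.

```python
import math


def posterior_risk(a, x, prior, k=2.0):
    """rho_a(x) up to a factor free of a: sum of loss * likelihood * prior weight."""
    return sum(
        weight * math.exp(-0.5 * (x - omega) ** 2) * abs(omega - a) ** k
        for omega, weight in prior.items()
    )


def bayes_action(x, prior, actions, k=2.0):
    """Smallest grid action minimizing the posterior risk (actions must be sorted)."""
    return min(actions, key=lambda a: posterior_risk(a, x, prior, k))


if __name__ == "__main__":
    prior = {-1.0: 0.3, 0.0: 0.4, 2.0: 0.3}
    actions = [i / 10 for i in range(-20, 31)]
    for x in (-3.0, -1.0, 0.0, 1.0, 3.0):
        print(x, bayes_action(x, prior, actions))
```

The normalizing constant of the posterior is omitted because it does not affect which action attains the minimum.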


9. Invariance and monotone strategies. Suppose a statistical procedure satisfying the monotonicity requirements is also invariant under a group of transformations, i.e., there is a group $G$ such that each element $g$ in $G$ generates three mappings $g_X$, $g_\Omega$, and $g_A$ of $X$, $\Omega$, and $A$, respectively, into themselves satisfying the following properties:

(a) The mapping of $g$ in $G$ into $(g_X, g_\Omega, g_A)$ is a homomorphism.

(b) $P(g_X(S) \mid g_\Omega(\omega)) = P(S \mid \omega)$ for any measurable set $S$ in $X$ and $\omega$ in $\Omega$. Of course, $g_X$ transforms measurable sets of $X$ into measurable sets, and conversely.

(c) $L(g_\Omega(\omega), g_A(a)) = L(\omega, a)$ for every $\omega$ in $\Omega$ and $a$ in $A$.

A decision procedure $\nu(a \mid x)$ is called invariant if for a set $T$ in $A$, $x$ in $X$, and any $g$ in $G$

$$\nu(g_A(T) \mid g_X(x)) = \nu(T \mid x).$$

To relate the monotonicity hypothesis in this paper to the invariance, we also require that all three functions $g_X(x)$, $g_\Omega(\omega)$, and $g_A(a)$ be monotone in the same direction.

The proof of the next theorem and other results connecting monotonicity and invariance will be deferred to a future paper where extensive details will be given. This last theorem is stated only for the purpose of providing here a complete discussion of the theory of statistical decision problems with monotone likelihood ratio.

THEOREM. If a statistical decision problem with monotone likelihood ratio is invariant under a group and the loss function $L(\omega, a)$ satisfies the monotonicity requirements of Section 7 with $g_A[q(\omega)] = q[g_\Omega(\omega)]$ for each $g$ in $G$ and $\omega$ in $\Omega$, then the class of monotone invariant procedures is essentially complete in the class of all invariant procedures.
ON THE POWER OF CERTAIN TESTS FOR INDEPENDENCE IN
BIVARIATE POPULATIONS¹


By H. S. KONIJN
University of California, Berkeley 


Summary. Let $F_{\gamma^0}$ denote the joint distribution of two independent random variables $Y_{\gamma^0}$ and $Z_{\gamma^0}$. The paper investigates properties of the joint distribution $F_\gamma$ of the linearly transformed random variables $Y_\gamma$ and $Z_\gamma$. Let $\mathcal{T}_0$ be the Spearman rank correlation test, $\mathcal{T}_1$ the difference sign correlation test, $\mathcal{T}_2$ the unbiased grade correlation test (which is asymptotically equivalent to $\mathcal{T}_0$), $\mathcal{T}_3$ the medial correlation test, and $\mathcal{R}$ the ordinary (parametric) correlation test. (Whenever discussing $\mathcal{R}$ we assume existence of fourth moments.) Properties of the power of these tests are found for alternatives of the above-mentioned form, particularly for alternatives "close" to the hypothesis of independence and for large samples.

Against these alternatives the efficiency of 3; is found to depend strongly on 
local properties of the densities of Yyo and Z,o , which should invite caution; and 
the efficiency of 3, with respect to 39 is often unity. 

Incidentally, Pitman’s result on efficiency is extended in several directions. 


1.1. Introduction. In the investigation, for a class of problems, of operating 
characteristics of tests of statistical hypotheses, the crucial point is the specifica- 
tion of a class of alternatives which is (i) sufficiently wide to include some ap- 
proximation to any situation that may arise in this class of problems, and (ii) 
is manageable mathematically. 

For testing the hypothesis that two samples are from the same population, this 
point has been dealt with—with some measure of success—by specifying as 
alternatives the cases in which the two populations differ by a location (shift) 
parameter but otherwise can have any continuous distribution. This seems a 
satisfactory idealization for a class of problems, and is easy to handle mathe- 
matically. 

The situation cannot be expected to be so simple for testing of the hypothesis 
of independence in bivariate populations. As a matter of fact, in many applica- 
tions it seems that because of the bewildering variety of possible ‘‘modes of 
dependence” it is not feasible to provide a reasonable specification of alterna- 
tives satisfying (i) and (ii). This paper makes an attempt to open up this topic 
by considering a rather narrow class of alternatives for which (ii) is satisfied, 
though the extent to which (i) is satisfied is much more doubtful. Another class 
is considered in [8]. 

Received August 31, 1954. 

1 Presented at the Pittsburgh meeting of the American Mathematical Society, December, 1954.

The class of alternatives considered is one under which the two random variables have been obtained by a linear transformation of two independent random variables. Cases of random variables which could have common but unobservable components may conform to this situation.

Thus, suppose that the outcome of the application of a battery $(\mathcal{A}_1, \mathcal{A}_2, \ldots)$ of psychological or psychophysical tests to a group of people has been subjected to a factorial analysis, revealing several independent common factors

$$F_1, \ldots, F_c.$$

Suppose that the analysis shows that apparently the outcome $A_{i_1}$ of $\mathcal{A}_{i_1}$ is practically determined by $F_1$ and $A_{i_2}$ of $\mathcal{A}_{i_2}$ by $F_2$. Psychological tests, especially aptitude tests, are often designed to achieve such "factorial purity" [4]. It may then be reasonable and desirable to identify $F_1$ operationally with $A_{i_1}$ and $F_2$ with $A_{i_2}$. Before doing this, one should make sure that $A_{i_1}$ and $A_{i_2}$ are independent random variables. If $c > 2$, let us assume that for $\mathcal{A}_{i_1}$ and $\mathcal{A}_{i_2}$ we can ignore $F_3, \ldots, F_c$, or that those among the latter set which affect $A_{i_1}$ do not affect $A_{i_2}$ and vice versa. Then the above description implies that there exist two independent random variables $Y$ and $Z$ and numbers $\lambda_1, \lambda_2, \lambda_3, \lambda_4$ such that

$$A_{i_1} = \lambda_1 Y + \lambda_2 Z,$$

$$A_{i_2} = \lambda_3 Y + \lambda_4 Z,$$

and the hypothesis to be tested can be written

$$\lambda_2 = \lambda_3 = 0.$$


The relative asymptotic efficiencies of the Spearman and medial correlation 
tests with respect to the ordinary correlation test have been previously con- 
sidered heuristically by Hotelling and Pabst [6] and Blomqvist [1], respectively, 
for normal alternatives; as far as is known to the author, no other investigations 
of the relative efficiencies of the tests discussed here have been published for 
bivariate distributions which stay constant during the sampling process. 

The present paper also contains an extension of Pitman’s result on local asymp- 
totic efficiency, which is believed to be of interest in itself. 

The author wants to express gratitude to Professor E. L. Lehmann for posing 
the problem and for numerous suggestions. 


1.2. Local asymptotic efficiency according to Pitman and an extension. Any statistical hypothesis and its alternatives can be described by two classes of probability distributions, $\mathcal{F}_0$ and $\mathcal{F}_1$, respectively.

We are interested in cases in which there is some natural way of generating the elements of $\mathcal{F}_1$ from those of $\mathcal{F}_0$ in such a way that each $F \in \mathcal{F}_0$ is obtained as a limit of a sequence of elements of the corresponding subset of $\mathcal{F}_1$. This can be formalized as follows:

Let $\mathcal{F}_0$ and $\mathcal{F}_1$ have the property that there is a set $\mathcal{A}$ of transformations $A$ from one to the other such that (1)

$$\mathcal{F}_0 \cup \mathcal{F}_1 = \bigcup_{F \in \mathcal{F}_0} \mathcal{A}[F],$$

where $\mathcal{A}[F] = \bigcup_{A \in \mathcal{A}} AF$ and where $F \in \mathcal{F}_0$ implies $F \in \mathcal{A}[F]$; (2) there exists a $K$-dimensional metric space $\Gamma$ of points $\gamma = (\gamma_1, \ldots, \gamma_K)$, which (a) contains a point $\gamma^0$ and (b) for each $F \in \mathcal{F}_0$ has a subset $\Gamma(F)$ (containing $\gamma^0$ as a limit point) whose elements are in one-to-one correspondence with those of $\mathcal{A}[F]$, with $F$ corresponding to $\gamma^0$.

DEFINITION. Given two nonnegative numbers $\alpha'$, $\alpha''$ for which $0 < \alpha' + \alpha'' < 1$, let $\mathcal{T} = \{\mathcal{T}_n\}$ be a sequence of tests of the hypothesis $\gamma = \gamma^0$, that is, of $F_\gamma = F_{\gamma^0}$ $(= F)$. Let $\mathcal{T}$ be such that for $n$ observations, $\mathcal{T}_n$ rejects the hypothesis if and only if the statistic $T_n$ does not exceed a maximal constant (or random variable) $t'_{nF}$ or does not fall below a minimal constant (or random variable) $t''_{nF}$ for which

$$\alpha'_{nF} = P\{T_n \le t'_{nF} \mid F\}, \qquad \alpha''_{nF} = P\{T_n \ge t''_{nF} \mid F\}$$

do not exceed $\alpha'$ and $\alpha''$, respectively, and converge to these numbers. We then say that $\mathcal{T}$ is an $(\alpha', \alpha'')$-level test (sequence).

For tests based on ranks, $\alpha'_{nF}$, $\alpha''_{nF}$, $t'_{nF}$, and $t''_{nF}$ are independent of $F$ when $F$ is continuous and hence are "distribution free."

In the following let $F \in \mathcal{F}_0$ and $\alpha'$ and $\alpha''$ be fixed. To obtain at least a certain fixed power $\beta > \alpha' + \alpha''$ for the test sequence $\mathcal{T}$ under the alternative $\gamma$, we have to choose the number $n$ of observations so large that the probability $\beta_n$ of rejecting the hypothesis with test $\mathcal{T}_n$ under $\gamma$ is at least $\beta$ (and shall otherwise choose $n$ as small as possible). For reaching the same power $\beta$ with the test sequence $\mathcal{T}^*$ under the same alternative, we have to choose the minimum number $n^*$ of observations sufficiently large for $\beta^*_{n^*}$, the probability of rejecting the hypothesis with test $\mathcal{T}^*_{n^*}$ under $\gamma$, to be at least $\beta$. Now for a fixed $\beta$, if we wish to let $n$ increase indefinitely, we have to allow $\gamma$ to vary with $n$:

$$\gamma = \gamma(n).$$

In particular, if $\mathcal{T}$ and $\mathcal{T}^*$ are consistent $(\alpha', \alpha'')$-level tests, the sequence $\{\gamma(n)\}$ of alternatives must converge to $\gamma^0$. Then, when $\lim_{n \to \infty} n/n^*$ exists, it is reasonable to call it the local asymptotic efficiency of $\mathcal{T}^*$ with respect to $\mathcal{T}$ against a sequence $\{\gamma(n)\}$ of alternatives with elements in $\Gamma - \gamma^0$. We shall now enumerate some conditions which suffice for its existence and allow us to calculate it. These slightly generalize those first given by Pitman (see [11]), who examined only cases with $\alpha'$ or $\alpha''$ equal to 0 and $K = 1$, and possessing certain other simple features.

DEFINITION. Suppose there exist $h > 0$ and functions $\psi_n$ and $\chi_n$ over $\Gamma$ such that for any $k \le K$

(i) $\lim_{n \to \infty} P\{[T_n - \psi_n(\gamma^0)]/\chi_n(\gamma^0) \le t \mid \gamma^0\} = (2\pi)^{-1/2} \int_{-\infty}^{t} e^{-x^2/2}\, dx$;

(ii) $\psi_{nk}(\gamma) = \partial \psi_n(\gamma)/\partial \gamma_k$ exists in a neighborhood of $\gamma^0$;

(iii) $n^{-h} \psi_{nk}(\gamma^0)/\chi_n(\gamma^0)$ converges to a constant $d_k$;

(iv) the alternatives $\gamma(n)$ are such that $\gamma(n) = \gamma^0 + n^{-h} c + o(n^{-h})$ (which defines neighborhoods of $\gamma^0$);

(v) $\chi_n(\gamma(n))/\chi_n(\gamma^0)$ converges to 1;

(vi) for $\gamma'(n)$ on the ray from $\gamma^0$ to $\gamma(n)$, $\psi_{nk}(\gamma'(n))/\psi_{nk}(\gamma^0)$ converges to 1;

(vii) $\lim_{n \to \infty} P\{[T_n - \psi_n(\gamma(n))]/\chi_n(\gamma(n)) \le t \mid \gamma(n)\} = (2\pi)^{-1/2} \int_{-\infty}^{t} e^{-x^2/2}\, dx$.

Let

$$\Delta(c) = \sum_{k=1}^{K} c_k d_k.$$

The class of all $(\alpha', \alpha'')$-level tests, consistent for testing $\gamma = \gamma^0$ against $\gamma \ne \gamma^0$ near $\gamma^0$ and such that for a given $h$ there exist functions $\psi_n$ and $\chi_n$ over $\Gamma$ satisfying these properties, will be called $\mathcal{C}^{(1)}_{\alpha', \alpha''}(h, \Gamma)$. The set of vectors $c$ generated by the set of all values $\beta > \alpha' + \alpha''$ will be denoted by $\bar\Gamma$.

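THEOREM 1.1. Let $\mathcal{T} \in \mathcal{C}^{(1)}_{\alpha', \alpha''}(h, \Gamma)$, $\mathcal{T}^* \in \mathcal{C}^{(1)}_{\alpha', \alpha''}(h^*, \Gamma)$, and let $\Delta(c)$ or $\Delta^*(c)$ differ from zero. Then for any sequence of alternatives with elements in $\Gamma - \gamma^0$ for which $c \in \bar\Gamma$ the local asymptotic efficiency of $\mathcal{T}^*$ with respect to $\mathcal{T}$ against this sequence exists and equals 0 if $h^* < h$, $\infty$ if $h^* > h$, and

$$\{\Delta^*(c)/\Delta(c)\}^{1/h}$$

otherwise.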
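The case distinction in Theorem 1.1 is easy to mechanize. The following sketch assumes that $\Delta(c)$ and $\Delta^*(c)$ have already been computed and are of the same sign, and it takes the exponent to be $1/h$, as the relation $n/n^* \sim \{\Delta^*(c)/\Delta(c)\}^{1/h}$ in the outline of the proof implies; the function name and argument names are hypothetical.

```python
import math


def local_asymptotic_efficiency(delta_star, delta, h_star, h):
    """Efficiency of the starred test relative to the unstarred one (Theorem 1.1).

    Assumes delta_star and delta are not both zero and do not differ in sign.
    """
    if h_star < h:
        return 0.0
    if h_star > h:
        return math.inf
    if delta == 0 and delta_star == 0:
        raise ValueError("Theorem 1.1 yields no result when both deltas vanish")
    if delta == 0:
        return math.inf
    ratio = delta_star / delta
    if ratio < 0:
        raise ValueError("this sketch assumes deltas of the same sign")
    return ratio ** (1.0 / h)


if __name__ == "__main__":
    # With h = 1/2 the efficiency is the squared ratio of the deltas.
    print(local_asymptotic_efficiency(0.3, 0.6, 0.5, 0.5))
```

With $h = h^* = 1/2$ the function returns the square of $\Delta^*(c)/\Delta(c)$, which is the form of the efficiency in the case treated by Pitman.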


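OUTLINE OF PROOF. Write

$$\Phi(t) = (2\pi)^{-1/2} \int_{-\infty}^{t} e^{-x^2/2}\, dx.$$

We have

$$\lim_{n \to \infty} P\{[T_n - \psi_n(\gamma^0)]/\chi_n(\gamma^0) \le [t'_n - \psi_n(\gamma^0)]/\chi_n(\gamma^0) \mid \gamma^0\} = \alpha',$$

$$\lim_{n \to \infty} [t'_n - \psi_n(\gamma^0)]/\chi_n(\gamma^0) = \delta',$$

where $\delta' = \Phi^{-1}(\alpha')$. Similarly, if $\delta'' = \Phi^{-1}(1 - \alpha'')$,

$$\lim_{n \to \infty} [t''_n - \psi_n(\gamma^0)]/\chi_n(\gamma^0) = \delta''.$$

Now for $0 < \vartheta < 1$

$$\lim_{n \to \infty} [t'_n - \psi_n(\gamma(n))]/\chi_n(\gamma(n)) = \lim_{n \to \infty} [t'_n - \psi_n(\gamma^0)]/\chi_n(\gamma(n)) - \lim_{n \to \infty} \sum_{k=1}^{K} [c_k + o(1)]\, n^{-h}\, \psi_{nk}(\gamma^0 + \vartheta(\gamma(n) - \gamma^0))/\chi_n(\gamma(n)) = \delta' - \Delta(c),$$

$$\lim_{n \to \infty} [t''_n - \psi_n(\gamma(n))]/\chi_n(\gamma(n)) = \delta'' - \Delta(c).$$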





304 . 8. KONIJN 


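But $1 - \beta_n$ equals

$$P\{[t'_n - \psi_n(\gamma(n))]/\chi_n(\gamma(n)) \le [T_n - \psi_n(\gamma(n))]/\chi_n(\gamma(n)) \le [t''_n - \psi_n(\gamma(n))]/\chi_n(\gamma(n)) \mid \gamma(n)\},$$

so that

$$1 - \beta = \lim_{n \to \infty} (1 - \beta_n) = \Phi(\delta'' - \Delta(c)) - \Phi(\delta' - \Delta(c)) = \Psi_{\alpha', \alpha''}(\Delta(c))$$

(say).

We wish to determine $n^*$ as a function of $n$ in such a way that for the same alternative,

$$\gamma(n) = \gamma^*(n^*),$$

both $\mathcal{T}$ and $\mathcal{T}^*$ reach (as closely as possible) the same power $\beta$. Let $h^* = h$ (the other cases are handled similarly). By assumption (iv)

$$\gamma(n) \sim \gamma^0 + n^{-h} c,$$

$$\gamma^*(n^*) \sim \gamma^0 + (n^*)^{-h} c^*,$$

$$c^* \sim (n/n^*)^{-h} c,$$

$$\Delta^*(c^*) \sim (n/n^*)^{-h} \sum_{k=1}^{K} c_k d_k^* = (n/n^*)^{-h} \Delta^*(c).$$

In the same manner as above, we obtain for $\mathcal{T}^*$

$$1 - \beta = \lim_{n^* \to \infty} (1 - \beta^*_{n^*}) = \Psi_{\alpha', \alpha''}(\Delta^*(c^*)),$$

so that, since $\Psi_{\alpha', \alpha''}$ has an inverse,

$$\Delta^*(c^*) = \Delta(c).$$

Consequently,

$$n/n^* \sim \{\Delta^*(c)/\Delta(c)\}^{1/h}.$$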
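The limiting power in the outline above, $\beta = 1 - \Psi_{\alpha', \alpha''}(\Delta(c))$ with $\Psi_{\alpha', \alpha''}(x) = \Phi(\delta'' - x) - \Phi(\delta' - x)$, can be evaluated directly when $\Phi$ is the standard normal distribution. The sketch below is an illustration with hypothetical function names; it treats $\alpha' = 0$ or $\alpha'' = 0$ as the corresponding critical value being infinite.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()


def acceptance_probability(delta, alpha_lower, alpha_upper):
    """Psi(delta) = Phi(delta'' - delta) - Phi(delta' - delta)."""
    d_lower = STD_NORMAL.inv_cdf(alpha_lower) if alpha_lower > 0 else -math.inf
    d_upper = STD_NORMAL.inv_cdf(1.0 - alpha_upper) if alpha_upper > 0 else math.inf
    upper_term = 1.0 if math.isinf(d_upper) else STD_NORMAL.cdf(d_upper - delta)
    lower_term = 0.0 if math.isinf(d_lower) else STD_NORMAL.cdf(d_lower - delta)
    return upper_term - lower_term


def asymptotic_power(delta, alpha_lower, alpha_upper):
    """Limiting rejection probability 1 - Psi(delta)."""
    return 1.0 - acceptance_probability(delta, alpha_lower, alpha_upper)


if __name__ == "__main__":
    print(asymptotic_power(0.0, 0.025, 0.025))  # level of the two-sided test
    print(asymptotic_power(1.0, 0.0, 0.05))     # one-sided test, shifted alternative
```

At $\Delta(c) = 0$ the function returns $\alpha' + \alpha''$, the level of the test, as the definition of an $(\alpha', \alpha'')$-level test requires.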


For sequences $\{\gamma(n)\}$ for which both $\Delta(c)$ and $\Delta^*(c)$ equal zero, Theorem 1.1 yields no result. As noted also in [12], it is desirable to be in a position to expand $\psi_n(\gamma)$ about $\gamma^0$, using terms of order higher than the first. We therefore give the following definition:

DEFINITION. Suppose there exist $h > 0$, a smallest integer $p$, and functions $\psi_n$ and $\chi_n$ over $\Gamma$, such that for any set $(k_1, \ldots, k_p)$ of $p$ not necessarily different integers $\le K$

(i) $\lim_{n \to \infty} P\{[T_n - \psi_n(\gamma^0)]/\chi_n(\gamma^0) \le t \mid \gamma^0\} = (2\pi)^{-1/2} \int_{-\infty}^{t} e^{-x^2/2}\, dx$;

(ii) $\psi_{n k_1 \cdots k_p}(\gamma) = \partial^p \psi_n(\gamma)/\partial \gamma_{k_1} \cdots \partial \gamma_{k_p}$ exists in a neighborhood of $\gamma^0$;

(iii) $n^{-ph} \psi_{n k_1 \cdots k_p}(\gamma^0)/\chi_n(\gamma^0)$ converges to a constant $d_{k_1 \cdots k_p}$;

(iv) the alternatives $\gamma(n)$ are such that $\gamma(n) = \gamma^0 + n^{-h} c + o(n^{-h})$ (which defines neighborhoods of $\gamma^0$), and if $c = (c_1, \ldots, c_K)$, then

$$\Delta(c) = (p!)^{-1} \sum_{(k_1, \ldots, k_p)} c_{k_1} \cdots c_{k_p}\, d_{k_1 \cdots k_p} \ne 0;$$

(v) $\chi_n(\gamma(n))/\chi_n(\gamma^0)$ converges to 1;

(vi) for $\gamma'(n)$ on the ray from $\gamma^0$ to $\gamma(n)$, $\psi_{n k_1 \cdots k_p}(\gamma'(n))/\psi_{n k_1 \cdots k_p}(\gamma^0)$ converges to 1;

(vii) $\lim_{n \to \infty} P\{[T_n - \psi_n(\gamma(n))]/\chi_n(\gamma(n)) \le t \mid \gamma(n)\} = (2\pi)^{-1/2} \int_{-\infty}^{t} e^{-x^2/2}\, dx$.

The class of all $(\alpha', \alpha'')$-level tests, consistent for testing $\gamma = \gamma^0$ against $\gamma \ne \gamma^0$ near $\gamma^0$ and such that for the same $h$, $p$ there exist functions $\psi_n$ and $\chi_n$ over $\Gamma$ satisfying these properties, will be called $\mathcal{C}^{(p)}_{\alpha', \alpha''}(h, \Gamma)$. The set of vectors $c$ generated by the set of all values for $\beta > \alpha' + \alpha''$ will be denoted by $\bar\Gamma$.

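THEOREM 1.2. If $\mathcal{T} \in \mathcal{C}^{(p)}_{\alpha', \alpha''}(h, \Gamma)$ and $\mathcal{T}^* \in \mathcal{C}^{(p^*)}_{\alpha', \alpha''}(h^*, \Gamma)$, then for any sequence of alternatives for which $c \in \bar\Gamma$,

$$\lim_{n \to \infty} n/n^*$$

exists and equals

$\{\Delta^*(c)/\Delta(c)\}^{1/(ph)}$ if $p^* = p$ and $h^* = h$,
0 if $p^* \ge p$ and $h^* \le h$ without both being equalities,
$\infty$ if $p^* \le p$ and $h^* \ge h$ without both being equalities.

We then define this as the local asymptotic efficiency $e(\mathcal{T}^*, \mathcal{T})$ of $\mathcal{T}^*$ with respect to $\mathcal{T}$ against a sequence of alternatives with elements in $\Gamma - \gamma^0$ for which $c \in \bar\Gamma$.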

Remarks. (a) When $K = 1$, $\Delta^*(c)/\Delta(c)$ does not involve $c$, so that the efficiency of $\mathcal{T}^*$ with respect to $\mathcal{T}$ does not depend on the particular values taken on by $\alpha'$, $\alpha''$ or $\beta$, being the same against all sequences of alternatives with elements in $\Gamma - \gamma^0$ for which $c \in \bar\Gamma$. This is not generally so when $K > 1$. The dependence on the values of $\alpha'$ or $\alpha''$ is through the sign of $\Delta(c)$, and disappears if either $\alpha'$ or $\alpha''$ is zero.

(b) The limiting distribution $\Phi$ does not have to be normal. It is sufficient if it is continuous, if $\Phi^{-1}(\alpha') = \delta'$ and $\Phi^{-1}(1 - \alpha'') = \delta''$ are uniquely determined, and if there exists $\epsilon > 0$ such that for $x \in (-\epsilon, \infty)$ or $x \in (-\infty, \epsilon)$,

$$\Psi_{\alpha', \alpha''}(x) = \Phi(\delta'' - x) - \Phi(\delta' - x)$$

is a monotone function of $x$, converging to zero as $|x| \to \infty$ (the monotonicity not ceasing to be strict until the function attains the value 0). Note that if either $\alpha'$ or $\alpha''$ vanishes, all continuous distribution functions $\Phi$ which are strictly increasing for the set of $t$ for which $0 < \Phi(t) < 1$ have this property.

(c) Against a fixed alternative $\gamma' \in \Gamma$ near $\gamma^0$, for sufficiently large $n$, the power of $\mathcal{T}$ is approximately

$$1 - \Psi_{\alpha', \alpha''}[n^{ph} \Delta(\gamma' - \gamma^0)].$$







We may call this expression the asymptotic power at $\gamma'$ (near $\gamma^0$). So if, for $\gamma'$ (near $\gamma^0$), $\Delta(\gamma' - \gamma^0) \ne 0$, $\mathcal{T}$ is consistent at $\gamma'$; and if $\{\gamma(n)\}$ is such that, as $\beta$ takes on different values exceeding $\alpha' + \alpha''$, $\Delta(c)$ differs from 0, $\mathcal{T}$ is consistent near $\gamma^0$.


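1.3. Some tests for independence. Consider a sample $X_1, \dots, X_n$ ($n \ge 2$)
from a bivariate population $F$, where $X_\alpha = (Y_\alpha, Z_\alpha)$, without ties in the $Y_\alpha$
or in the $Z_\alpha$. Let $R_{(\alpha)}$ be the rank of $Y_\alpha$, $S_{(\alpha)}$ the rank of $Z_\alpha$, and $\tilde Y$ and $\tilde Z$
the sample medians of $Y$ and $Z$. Define, for $\alpha, \beta, \gamma$ ranging over $1, \dots, n$,

$$T_{0n} = \frac{3}{n(n-1)(n+1)} \sum_{\alpha,\beta,\gamma} \mathrm{sgn}(Y_\alpha - Y_\beta)\,\mathrm{sgn}(Z_\alpha - Z_\gamma),$$

$$T_{1n} = \frac{1}{n(n-1)} \sum_{\alpha \ne \beta} \mathrm{sgn}(Y_\alpha - Y_\beta)\,\mathrm{sgn}(Z_\alpha - Z_\beta),$$

$$T_{2n} = \frac{3}{n(n-1)(n-2)} \sum_{\alpha \ne \beta \ne \gamma \ne \alpha} \mathrm{sgn}(Y_\alpha - Y_\beta)\,\mathrm{sgn}(Z_\alpha - Z_\gamma) \ \text{ for } n > 2, \qquad T_{2n} = 0 \ \text{ for } n = 2,$$

$$T_{3n} = \frac{(N' - N'') + (n' - n'')}{(N' + N'') + (n' + n'')},$$

where $N'$ is the number of $\alpha$ for which $\mathrm{sgn}(Y_\alpha - \tilde Y)\,\mathrm{sgn}(Z_\alpha - \tilde Z)$ is positive,
$N''$ the number for which it is negative, and $n'$ [$n''$] is the number of pairs $(\alpha, \beta)$
for which $\alpha \ne \beta$, $Y_\alpha = \tilde Y$, $Z_\beta = \tilde Z$, and $\mathrm{sgn}(Y_\beta - \tilde Y)\,\mathrm{sgn}(Z_\alpha - \tilde Z)$ is positive
[negative].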

$T_{1n}$ is the difference sign correlation, $T_{0n}$ the (Spearman) rank correlation,
$T_{2n}$ the unbiased grade correlation (introduced by Hoeffding [5]), and $T_{3n}$ the
medial correlation proposed by Sheppard [15] and discussed by Blomqvist
[1]. (The name medial correlation was proposed in [14].)
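The four statistics can be computed by brute force straight from their definitions. The following is a minimal sketch, assuming no ties within the $Y$'s or within the $Z$'s; the function name `rank_statistics` and the order of the returned values are choices made here for illustration.

```python
from statistics import median


def sgn(x):
    return (x > 0) - (x < 0)


def rank_statistics(y, z):
    """Return (T0n, T1n, T2n, T3n) for paired samples y, z without ties."""
    n = len(y)
    idx = range(n)
    # Sum over ordered pairs; terms with a == b are zero anyway.
    pair_sum = sum(sgn(y[a] - y[b]) * sgn(z[a] - z[b]) for a in idx for b in idx)
    # Sum over all triples; terms with a == b or a == c vanish.
    triple_sum = sum(sgn(y[a] - y[b]) * sgn(z[a] - z[c])
                     for a in idx for b in idx for c in idx)
    distinct_sum = triple_sum - pair_sum  # drop the b == c terms

    t0 = 3 * triple_sum / (n * (n - 1) * (n + 1))
    t1 = pair_sum / (n * (n - 1))
    t2 = 3 * distinct_sum / (n * (n - 1) * (n - 2)) if n > 2 else 0.0

    y_med, z_med = median(y), median(z)
    quadrant = [sgn(ya - y_med) * sgn(za - z_med) for ya, za in zip(y, z)]
    big_pos, big_neg = quadrant.count(1), quadrant.count(-1)
    # Pairs (a, b), a != b, with Y_a and Z_b lying exactly on the sample medians.
    cross = [sgn(y[b] - y_med) * sgn(z[a] - z_med)
             for a in idx for b in idx
             if a != b and y[a] == y_med and z[b] == z_med]
    small_pos, small_neg = cross.count(1), cross.count(-1)
    t3 = ((big_pos - big_neg) + (small_pos - small_neg)) / (
        (big_pos + big_neg) + (small_pos + small_neg))
    return t0, t1, t2, t3
```

For a perfectly concordant sample all four values equal 1, and for a perfectly discordant sample they equal -1. The triple loop is cubic in $n$, so the sketch is meant for small samples only.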

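It may be noted that the formulae for $T_{0n}$ and $T_{3n}$ are obtainable from the
formula for the ordinary correlation coefficient

$$R_n = \sum (Y_\alpha - \bar Y)(Z_\alpha - \bar Z) \Big/ \Big\{ \sum (Y_\alpha - \bar Y)^2 \sum (Z_\alpha - \bar Z)^2 \Big\}^{1/2}$$

$$(\bar Y = \sum Y_\alpha / n, \quad \bar Z = \sum Z_\alpha / n)$$

by substituting ranks for observations and, in the case of $T_{3n}$ (assuming $n$ even),
interpreting ( ) as the signum of the quantities in parentheses. Applications of
these operations to the alternative form [3]

$$R_n = \sum_{\alpha,\beta} (Y_\alpha - Y_\beta)(Z_\alpha - Z_\beta) \Big/ \Big\{ \sum_{\alpha,\beta} (Y_\alpha - Y_\beta)^2 \sum_{\alpha,\beta} (Z_\alpha - Z_\beta)^2 \Big\}^{1/2}$$

gives $T_{0n}$ and $T_{1n}$, and to the alternative form

$$R_n = \frac{\sum_{\alpha \ne \beta \ne \gamma \ne \alpha} (Y_\alpha - Y_\beta)(Z_\alpha - Z_\gamma)}{\Big\{ \sum_{\alpha \ne \beta \ne \gamma \ne \alpha} (Y_\alpha - Y_\beta)(Y_\alpha - Y_\gamma) \sum_{\alpha \ne \beta \ne \gamma \ne \alpha} (Z_\alpha - Z_\beta)(Z_\alpha - Z_\gamma) \Big\}^{1/2}}$$

gives $T_{0n}$ and $T_{2n}$.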

These statistics are discussed in [5] and [1], where the following properties
are proved for $F$ continuous:

(1) $T_{1n}$ has mean $\tau_1 = \tau_1[F] = E\Phi_{12}(X_\alpha, X_\beta)$, where $X_\alpha, X_\beta$ is a random
sample of size 2 and $\Phi_{12}(x_1, x_2) = \tfrac12 \sum_{i \ne j} \mathrm{sgn}(y_i - y_j)\,\mathrm{sgn}(z_i - z_j)$ ($i, j = 1, 2$). Let
$\Phi_{11}(x) = E\Phi_{12}(x, X)$; then $\tau_1 = E\Phi_{11}(X)$. One finds $\Phi_{11}(x) = 4F(y, z) - 2F(y, \infty) - 2F(\infty, z) + 1$,
so that $\tau_1 = 4 \iint F\,dF - 1$. $T_{1n}$ has the variance

$$\sigma_{1n}^2 = \frac{2}{n(n-1)} \big[ 2(n-2)\{E\Phi_{11}^2(X) - \tau_1^2\} + 1 - \tau_1^2 \big],$$

which is finite and converges to zero with $1/n$. In case of independence $\tau_1 = 0$,
$E\Phi_{11}^2(X) = \tfrac19$, and we have

$$\sigma_{1n}^2 = \frac{2(2n+5)}{9n(n-1)}.$$

The distribution of $(T_{1n} - \tau_1)/\sigma_{1n}$ converges to the normal distribution with 0
mean and unit variance if $E\Phi_{11}^2(X) \ne \tau_1^2$.
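The independence variance $2(2n+5)/\{9n(n-1)\}$ can be checked exactly for small $n$. The sketch below assumes that, under independence with continuous $F$, all $n!$ arrangements of the $Z$-ranks against the ordered $Y$'s are equally likely, and enumerates them; the function names are illustrative.

```python
from fractions import Fraction
from itertools import permutations


def t1_for_permutation(perm):
    """Exact T1n when the Y's are increasing and the Z-ranks are given by perm."""
    n = len(perm)
    concordant_minus_discordant = sum(
        1 if perm[a] < perm[b] else -1
        for a in range(n) for b in range(a + 1, n))
    # Each unordered pair appears twice in the sum over a != b.
    return Fraction(2 * concordant_minus_discordant, n * (n - 1))


def exact_null_variance_t1(n):
    """Variance of T1n over all n! equally likely rank permutations."""
    values = [t1_for_permutation(p) for p in permutations(range(n))]
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)
```

Exact rational arithmetic is used so that the comparison with the closed form involves no rounding.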

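(2) $T_{2n}$ has mean $\tau_2 = \tau_2[F] = E\Phi_{23}(X_\alpha, X_\beta, X_\gamma)$, where $X_\alpha, X_\beta, X_\gamma$ is a
random sample of size 3 and $\Phi_{23}(x_1, x_2, x_3) = \tfrac12 \sum_{i \ne j \ne k \ne i} \mathrm{sgn}(y_i - y_j)\,\mathrm{sgn}(z_i - z_k)$
($i, j, k = 1, 2, 3$). Let $\Phi_{22}(x_1, x_2) = E\Phi_{23}(x_1, x_2, X)$, $\Phi_{21}(x) = E\Phi_{22}(x, X)$; then $\tau_2 = E\Phi_{21}(X)$.
One finds for $n > 2$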


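$$\Phi_{22}(x_1, x_2) = 1 + 2F(y_1, z_2) + 2F(y_2, z_1) - F(y_1, \infty) - F(y_2, \infty) - F(\infty, z_1) - F(\infty, z_2)$$
$$\qquad + \{F(y_1, \infty) - F(y_2, \infty)\}\,\mathrm{sgn}(z_1 - z_2) + \{F(\infty, z_1) - F(\infty, z_2)\}\,\mathrm{sgn}(y_1 - y_2),$$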
$$\Phi_{21}(x) = 1 - 4F(y, \infty) - 4F(\infty, z) + 4F(y, \infty)F(\infty, z) + 4 \int F(y, z')\,dF(\infty, z') + 4 \int F(y', z)\,dF(y', \infty),$$


so that

$$\tau_2 = 12 \iint F(y, \infty)F(\infty, z)\,dF(y, z) - 3 = 12 \iint \{F(y, \infty) - \tfrac12\}\{F(\infty, z) - \tfrac12\}\,dF(y, z),$$

called the "grade correlation." Lemma 4.1 of [8] proves that this equals

$$12 \iint F(y, z)\,dF(y, \infty)\,dF(\infty, z) - 3 = 12 \iint \{F(y, z) - F(y, \infty)F(\infty, z)\}\,dF(y, \infty)\,dF(\infty, z).$$

$T_{2n}$ has variance (for $n > 2$)

$$\sigma_{2n}^2 = \frac{6}{n(n-1)(n-2)} \Big[ 3\binom{n-3}{2}\{E\Phi_{21}^2(X) - \tau_2^2\} + 3(n-3)\{E\Phi_{22}^2(X_\alpha, X_\beta) - \tau_2^2\} + 1 - \tau_2^2 \Big],$$

where $X_\alpha$ and $X_\beta$ are independent; this is finite and converges to zero with $1/n$.
In case of independence $\tau_2 = 0$, $E\Phi_{21}^2(X) = \tfrac19$, $E\Phi_{22}^2(X_\alpha, X_\beta) = \tfrac{7}{18}$, and we have

$$\sigma_{2n}^2 = \frac{n^2 - 3}{n(n-1)(n-2)},$$

while $\rho(T_{1n}, T_{2n}) \to 1$, $\rho$ representing the correlation coefficient, so that the asymptotic
functional relation $3T_{1n} = 2T_{2n}$ holds [3]. The latter relation does not hold
in general in the case of dependence. $(T_{2n} - \tau_2)/\sigma_{2n}$ converges to the normal
distribution with 0 mean and unit variance if $E\Phi_{21}^2(X) \ne \tau_2^2$. All future references
to $T_{2n}$, $\tau_2$, and $\sigma_{2n}$ will be understood to bear an appropriate qualification
for $n = 2$.
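The independence variance of $T_{2n}$ can be checked in the same exact way as that of $T_{1n}$. The sketch below again assumes equally likely rank permutations under independence. It uses the fact that, with zero-based ranks, $\sum_\beta \mathrm{sgn}(Y_\alpha - Y_\beta) = 2r_\alpha - (n-1)$, so the full triple sum factorizes; the function names are illustrative.

```python
from fractions import Fraction
from itertools import permutations


def t2_for_permutation(perm):
    """Exact T2n when the Y's are increasing and the Z-ranks are given by perm."""
    n = len(perm)

    def centred(rank):
        # (number of observations below) minus (number above), zero-based rank
        return 2 * rank - (n - 1)

    triple_sum = sum(centred(a) * centred(perm[a]) for a in range(n))
    pair_sum = 2 * sum(1 if perm[a] < perm[b] else -1
                       for a in range(n) for b in range(a + 1, n))
    # Removing the beta == gamma terms leaves the sum over distinct triples.
    return Fraction(3 * (triple_sum - pair_sum), n * (n - 1) * (n - 2))


def exact_null_variance_t2(n):
    """Variance of T2n over all n! equally likely rank permutations (n > 2)."""
    values = [t2_for_permutation(p) for p in permutations(range(n))]
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)
```

For $n = 3$ the statistic only takes the values $\pm 1$, so its independence variance is 1, in agreement with $(n^2 - 3)/\{n(n-1)(n-2)\}$.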

(3) Assume that the medians of the $Y$ and $Z$ populations are unique, and denote
them by $\mu$ and $\nu$, and let $F(\mu, \nu)$ be different from 0 or $\tfrac12$. (The other assumptions
in [1] can be shown, by use of the Glivenko-Cantelli theorem, to be superfluous;
but the condition on $F(\mu, \nu)$, not given there, is, in fact, essential.) Then
the distribution of

$$\tfrac12 \{N' - 2nF(\mu, \nu)\}\,[nF(\mu, \nu)\{\tfrac12 - F(\mu, \nu)\}]^{-1/2}$$

converges to the normal distribution with 0 mean and unit variance; so the same holds for
$(T_{3n} - \tau_3)/\sigma_{3n}$, where

$$\tau_3 = \tau_3[F] = 4[F(\mu, \nu) - F(\mu, \infty)F(\infty, \nu)],$$

$$\sigma_{3n}^2 = \frac{16}{n} F(\mu, \nu)\{\tfrac12 - F(\mu, \nu)\} = \frac{1}{n}(1 - \tau_3^2),$$

as $F(\mu, \nu) = \tfrac14(1 + \tau_3)$. In any case, under continuous $F_{\gamma^0}$, the distribution of
$T_{3n}$ is symmetric.
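(4) $T_{0n} = [(n - 2)T_{2n} + 3T_{1n}]/(n + 1)$. Moreover,

$$\mathrm{cov}(T_{1n}, T_{2n}) = \frac{6}{n(n-1)} \big[ (n-3)\{E\Phi_{11}(X)\Phi_{21}(X) - \tau_1\tau_2\} + \{E\Phi_{12}(X_\alpha, X_\beta)\Phi_{22}(X_\alpha, X_\beta) - \tau_1\tau_2\} \big],$$

where $X_\alpha$ and $X_\beta$ are independent. In case of independence $E\Phi_{11}(X)\Phi_{21}(X) = \tfrac19$,
$E\Phi_{12}(X_\alpha, X_\beta)\Phi_{22}(X_\alpha, X_\beta) = \tfrac59$, and we get

$$\rho(T_{1n}, T_{2n}) \to 1.$$

The asymptotic distribution of $T_{0n}$ follows from this and the remarks under (2).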
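The linear relation in (4) and the independence covariance can both be verified by enumeration for small $n$. The sketch below computes the three statistics by brute force for every rank permutation. The closed form $2(n+2)/\{3n(n-1)\}$ used in the check is obtained here by inserting the independence values into the covariance formula; it is a derived check, not a formula quoted from the text.

```python
from fractions import Fraction
from itertools import permutations


def sgn(x):
    return (x > 0) - (x < 0)


def statistics_for(perm):
    """Exact (T0n, T1n, T2n) with the Y's increasing and Z-ranks given by perm."""
    n = len(perm)
    idx = range(n)
    pairs = sum(sgn(a - b) * sgn(perm[a] - perm[b]) for a in idx for b in idx)
    triples = sum(sgn(a - b) * sgn(perm[a] - perm[c])
                  for a in idx for b in idx for c in idx)
    distinct = sum(sgn(a - b) * sgn(perm[a] - perm[c])
                   for a in idx for b in idx for c in idx
                   if len({a, b, c}) == 3)
    t0 = Fraction(3 * triples, n * (n * n - 1))
    t1 = Fraction(pairs, n * (n - 1))
    t2 = Fraction(3 * distinct, n * (n - 1) * (n - 2))
    return t0, t1, t2


def check_identity_and_covariance(n):
    """Return (identity holds for every permutation, exact cov of T1n and T2n)."""
    results = [statistics_for(p) for p in permutations(range(n))]
    identity_ok = all(t0 == ((n - 2) * t2 + 3 * t1) / (n + 1)
                      for t0, t1, t2 in results)
    # Under independence both means are zero, so the covariance is E[T1n * T2n].
    cov = sum(t1 * t2 for _, t1, t2 in results) / len(results)
    return identity_ok, cov
```

Because $T_{0n}$ and $T_{2n}$ are computed from separate sums here, the identity check is a genuine comparison rather than an algebraic restatement.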


In case the variances of $Y$ and $Z$ are finite and positive, one could compare
these tests with the (parametric) test $\mathcal{R}$ to see whether the correlation coefficient
$\rho$ vanishes. (We shall see, however, that in cases of interest in this paper, $\rho = 0$
frequently does not imply independence.) According to Cramér ([2], pp.
and 366), $ER_n = \rho + O(1/n)$, and, if the fourth moments are finite,

$$E(R_n - \rho)^2 = \frac{k}{n} + O(n^{-3/2}),$$

with

$$k = \frac{\rho^2}{4}\Big( \frac{\mu_{40}}{\mu_{20}^2} + \frac{\mu_{04}}{\mu_{02}^2} + \frac{2\mu_{22}}{\mu_{20}\mu_{02}} \Big) - \rho\Big( \frac{\mu_{31}}{\mu_{20}} + \frac{\mu_{13}}{\mu_{02}} \Big)(\mu_{02}\mu_{20})^{-1/2} + \frac{\mu_{22}}{\mu_{20}\mu_{02}},$$

and the asymptotic distribution of $(R_n - \rho)(k/n)^{-1/2}$ is normal with zero mean
and unit variance when the fourth moments are finite and positive and $k$ differs
from 0. If $Y$ and $Z$ are independent, we get $k = 1$; so that (given positive finite
fourth moments of $X$) with respect to classes of alternatives for which (at least
in the neighborhood of the null hypothesis) $k$ and $\rho$ differ from 0 and the
fourth moments are finite, and for which the convergence to normality is uniform,
we have

$$A_R(c) = \sum_{k=1}^{K} c_k\, \frac{\partial \rho}{\partial \gamma_k},$$

provided this $\ne 0$. As a test for independence against a class of alternatives which
includes a case in which $\rho$ vanishes, the $\mathcal{R}$-test is easily seen to be inconsistent.
In any case, when finite positive second moments of $Y$ and $Z$ exist, we note that
$ER_n = 0$ if $Y$ and $Z$ are independent.
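The constant $k$ is a function of the central moments $\mu_{ij}$ only, so it is easy to evaluate. The sketch below writes it in the arrangement $\tfrac{\rho^2}{4}(\cdot) - \rho(\cdot) + \mu_{22}/(\mu_{20}\mu_{02})$, which is algebraically equivalent to the usual form of Cramér's large-sample variance of the correlation coefficient. The function names and the bivariate normal test case are illustrative assumptions, not taken from the text.

```python
import math


def cramer_k(mu20, mu02, mu11, mu22, mu40, mu04, mu31, mu13):
    """Constant k in E(R_n - rho)^2 = k/n + ..., from central moments mu_ij."""
    rho = mu11 / math.sqrt(mu20 * mu02)
    return (rho ** 2 / 4 * (mu40 / mu20 ** 2 + mu04 / mu02 ** 2
                            + 2 * mu22 / (mu20 * mu02))
            - rho * (mu31 / mu20 + mu13 / mu02) / math.sqrt(mu02 * mu20)
            + mu22 / (mu20 * mu02))


def bivariate_normal_moments(rho, s1, s2):
    """Central moments of a bivariate normal law (standard Gaussian moment formulas)."""
    return dict(mu20=s1 ** 2, mu02=s2 ** 2, mu11=rho * s1 * s2,
                mu22=s1 ** 2 * s2 ** 2 * (1 + 2 * rho ** 2),
                mu40=3 * s1 ** 4, mu04=3 * s2 ** 4,
                mu31=3 * rho * s1 ** 3 * s2, mu13=3 * rho * s1 * s2 ** 3)
```

For a bivariate normal population the function reproduces the familiar value $(1 - \rho^2)^2$, and whenever $\rho = 0$ and $\mu_{22} = \mu_{20}\mu_{02}$ (as under independence) it returns 1.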


2. General linear transformations. Let $X_{\lambda^0} = (Y_{\lambda^0}, Z_{\lambda^0})$ be a pair of independent
random variables with (nondegenerate) marginal distributions $G$ and $H$.
By $F_\lambda$ we shall denote the joint distribution of the pair $X_\lambda = (Y_\lambda, Z_\lambda)$ of linearly
transformed variables

$$Y_\lambda = \lambda_1 Y_{\lambda^0} + \lambda_2 Z_{\lambda^0},$$
$$Z_\lambda = \lambda_3 Y_{\lambda^0} + \lambda_4 Z_{\lambda^0}.$$

Let $\Lambda$ denote the set of those nonsingular linear transformations $\lambda$ which do
not consist merely of a change of scale or permutation of $Y_{\lambda^0}$ and $Z_{\lambda^0}$, or the identity
transformation $\lambda^0$. $\Lambda$ generates a class $\mathcal{L}$ of transformations of distributions
in $\mathcal{F} = \{GH \mid G, H \text{ nondegenerate}\}$ and a four-dimensional Euclidean space
containing $\lambda^0 = (1, 0, 0, 1)$.

A transformation $\lambda \in \Lambda - \{\lambda^0\}$ nearly always makes the resultant $Y_\lambda$ and
$Z_\lambda$ dependent:

THEOREM 2.1. If $\lambda \in \Lambda$, $F_\lambda$ is either normal with $\rho = 0$,

$$\sigma^2(Z_\lambda)/\sigma^2(Y_\lambda) = -(\lambda_3/\lambda_2)/(\lambda_1/\lambda_4),$$

or a distribution of dependent random variables.

PROOF. We have to show that if $F_\lambda$ is independent, it is normal. In 1948 a
proof, by Loève, of a slight extension of the following proposition was published
in a treatise of Lévy [10]: For a pair $U = (U_1, U_2)$ of independent random
variables to be normally distributed, it suffices that there exists a nonsingular
linear transformation $\lambda$ such that $V_1 = \lambda_1 U_1 + \lambda_2 U_2$ and $V_2 = \lambda_3 U_1 + \lambda_4 U_2$
are independently distributed, and that $\lambda_1\lambda_2\lambda_3\lambda_4 \ne 0$. Note that the last condition
is implied by the others when $\lambda \in \Lambda - \{\lambda^0\}$. The formula for the ratio of
variances follows from this last condition and the vanishing of the correlation
of $Y_\lambda$ and $Z_\lambda$.

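To arrive at the relative efficiencies, we first derive the following lemmas.

LEMMA 2.1. Let $G$ and $H$ possess first moments and densities which together with
their derivatives are continuous. Let

$$I_1(\lambda) = \iint F_\lambda\,dF_\lambda,$$

$$I_2(\lambda) = \iiint F_\lambda\,dF_\lambda(y, \infty)\,dF_\lambda(\infty, z) = \iint F_\lambda(y, \infty)F_\lambda(\infty, z)\,dF_\lambda(y, z),$$

$$I_3(\lambda) = F_\lambda(\mu, \nu),$$

where $F_\lambda(\mu, \infty) = F_\lambda(\infty, \nu) = \tfrac12$. (We omit $\pm\infty$ from the regions of integration.)
Then

$$\frac{\partial I_1(\lambda^0)}{\partial \lambda_1} = \frac{\partial I_1(\lambda^0)}{\partial \lambda_4} = 0,$$

$$\frac{\partial I_1(\lambda^0)}{\partial \lambda_2} = 2EG'(Y_{\lambda^0})\,\mathrm{cov}\{Z_{\lambda^0}, H(Z_{\lambda^0})\} > 0,$$

$$\frac{\partial I_1(\lambda^0)}{\partial \lambda_3} = 2EH'(Z_{\lambda^0})\,\mathrm{cov}\{Y_{\lambda^0}, G(Y_{\lambda^0})\} > 0,$$

$$\frac{\partial I_2(\lambda^0)}{\partial \lambda_1} = \frac{\partial I_2(\lambda^0)}{\partial \lambda_4} = 0, \qquad \frac{\partial I_2(\lambda^0)}{\partial \lambda_2} = \frac12 \frac{\partial I_1(\lambda^0)}{\partial \lambda_2}, \qquad \frac{\partial I_2(\lambda^0)}{\partial \lambda_3} = \frac12 \frac{\partial I_1(\lambda^0)}{\partial \lambda_3},$$

which expressions are invariant under a change of origin of $Y_{\lambda^0}$ and $Z_{\lambda^0}$.
If, moreover, $G$ and $H$ are symmetric about their means and their densities do not
vanish there,

$$\frac{\partial I_3(\lambda^0)}{\partial \lambda_1} = \frac{\partial I_3(\lambda^0)}{\partial \lambda_4} = 0,$$

$$\frac{\partial I_3(\lambda^0)}{\partial \lambda_2} = \tfrac12 G'(EY_{\lambda^0})\,E|Z_{\lambda^0} - EZ_{\lambda^0}| > 0,$$

$$\frac{\partial I_3(\lambda^0)}{\partial \lambda_3} = \tfrac12 H'(EZ_{\lambda^0})\,E|Y_{\lambda^0} - EY_{\lambda^0}| > 0,$$

which expressions are invariant under a change of origin of $Y_{\lambda^0}$ and $Z_{\lambda^0}$.

PROOF. Admissibility of differentiation under the integral sign is demonstrated
in the proof of Theorem 2.2. If the ranges of increase of $G$ and $H$ have
finite bounds, the continuity of the densities implies that they vanish at the
bounds, and so the derivatives of these integrals are effectively free of terms
involving the values of the integrands at the ends of their ranges. Therefore,
we may as well suppose doubly infinite ranges of increase.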

We note at once that for \.x = As 0, and z 1, 2, or 3, T;(A) is independent 
of \; and Ay, so that 


OI ;(X) 
On; ew AO 


; ~2 ww (MY — A2zZ\ pp, ( —Asy + AZ 
(\) = (det) o (=... — 
O) idet 2) I | ( det » ) - ( det » ) 
? . ts ‘os? (—ee te?) bot saa 
Gg’ | ——__— a 2 ; 
| | ( det - det X qaine 


+ | G" (y)G(y) dy / 2H'(2)H(2) dz 


= 0 for? = 1,2,3;7 


Ol y( dr ) 
OX2 


_ [ ee dy | H'(z) [ 2H'(2) dz dz. 
Integration by parts gives 


[ o"maw dy = Gna | — [ewew dy = —EG'(Y 0), 


[xe | 2H'(2z) dz dz = H(z) . 2H' (2) as| — [ a@'@ dz 


= EZ,. — EZ,, H(Z,0). 
Consequently, 


a1,(a) 


Odo | rmn0 


= EG'(¥y)EZ,0 H(Z,0) — EG’(Y x0) {EZ,0 — EZ,0oH(Z,0) } 


— EG’(Y¥y)EZ,0 + 2EG’(¥y0) EZ,0H (Zo) 
= 2EG'(Y.) cov {Z,0, ¥(Z,0)}. 


If Go and Hp are the distributions of Yo = Yxo — EY yo and Zyo = Zyo — EZ), 


[ HG) / 2H'(2) dz dz = cc — EZ,) / 2H3(2 — EZ,0) dz dz 


= [ H(z) | 3Hi(3) dz dz + 3EZ,0, 
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which by nondegeneracy of Ho is less than }£Z,o. Since 
EZ\oH(Zx0) = EZ,oH(Z,0) — 4EZyo, EGo(Yx) = EG’(Y¥ 0), 
it follows that EZoHo(Z;0) > O and that 


al; () 


= = 2EGo(Yx0)EZj0 Ho(Zx0) = 2ZEGo(Yx0) cov {Zx0, Ho(Zyo)} > O. 
CA2 A= AO 


Similarly, 
d12(A) 
OA3 Awe 9 


I.(X) = | det »\~° | fa’ (™Y ue ue | H’ (gs Az ) ae’ 
k det det A 


Bag 2) we (=But2) ay\ 
[@ ( det X )H ( ak. dy j 


re py - MG —_ A2Z (M8 > Mt) , 
. ~ det X ——s Y az zZ 
J J ( det d ) det A» dy d | au dz, 


= — | G” (y)G(y) dy | 2’H'(z') dz’ | H’(@)H@) dz 


= 2EH'(Z,) cov {Yy0,G(Vx0)} = 2EHo(Zy0) cov { Yxo, Go(¥x0)} > O. 


ah() 
OX2 Awe dO 


-- | G’(y)G(y) dy | G” (y’) dy’ { 2H’) H() dz 


- [ewe dy | He) | 2H'(z) dz dz 


= —1LEG'(Y.)EZ,0 — EG’(¥x0){EZ,0 — EZ,0H(Z,0)} 
aly) 


Od2 haw 0 


1 
2 


and similarly, 
012(d) ] 01,(A) 
OA3 Awe 0 2 OX3 haw 9 : 


Suppose that, moreover, $G$ and $H$ are symmetric about $\mu'$ and $\nu'$, respectively,
and that $G'(\mu')$ and $H'(\nu')$ are different from zero.
_ . ’ Nay — Az , /—dsy +z 
10) = [det xt? ff a (MY —™?) 2 UT M2) ay de 
) ; det ) ( det » ~ 

Note that uw = Ay’ + Av’, v = Agu’ + Ay’, ef G’(y) dy = Z, I;(d°) : 3; 

al; - Mt F . 

IQ)| / G"(y) dy | 2H"(e) de + Gu!) | HG) de 


Or2 hae 0 


, 
ov 


y’ 


(v’ — z)H"(z) dz 


—G'(u') [ 2H'(2) dz + Wer’) = ov) | 


J 


1G’ (u")E | Zx0 — v’| = 3G"(EYy0)E | Z,0 — EZ,0| > 0, 
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since 
av av’ 


O>| @—v)H'@dze= | cH’) dz - Ww’. 


~ J 


Yyo — pw’ and Zyo Z,o — v’ have the distribution Gp and H, 


0 


—Gi(0) | 2Hi(e) dz = 3G5(0)E | Zho| > 0; 


013(A) 
Odo 


similarly, 


d1s(n) = H’(v’) (u’ — y)G'(y) dy = —H5(0) | yGo(y) dy > 0. 
A3 haw 0 d 

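We now obtain the following general theorem (for notations, see Section 1.2):

THEOREM 2.2. Let $\Lambda'$ denote a set of nonsingular linear transformations of
$Y_{\lambda^0}$ and $Z_{\lambda^0}$ which do not consist merely of a change of scale or permutation of $Y_{\lambda^0}$
and $Z_{\lambda^0}$ or the identity transformation $\lambda^0$.

I(a). Let $G$ and $H$ have first moments and continuously differentiable densities,
or be limits of such distributions and possess densities. For sequences $\{\lambda(n)\}$ of
elements of $\Lambda' - \{\lambda^0\}$ converging to $\lambda^0$ for which for each $\beta > \alpha' + \alpha''$ the numbers

$$\lim_{n \to \infty} \sqrt{n}\,\lambda_2(n) = c_2, \qquad \lim_{n \to \infty} \sqrt{n}\,\lambda_3(n) = c_3$$

exist and satisfy²

$$c_2 EG'(Y_{\lambda^0})\,\mathrm{cov}\{Z_{\lambda^0}, H(Z_{\lambda^0})\} + c_3 EH'(Z_{\lambda^0})\,\mathrm{cov}\{Y_{\lambda^0}, G(Y_{\lambda^0})\} \ne 0,$$

we have², for $i = 0, 1, 2$,

$$\mathcal{T}_i, \text{ applied as an } (\alpha', \alpha'')\text{-level test, is in } \mathcal{P}^{1}_{\alpha',\alpha''}(\tfrac12, \Lambda'),$$

$$A_i(c) = 12c_2 EG'(Y_{\lambda^0})\,\mathrm{cov}\{Z_{\lambda^0}, H(Z_{\lambda^0})\} + 12c_3 EH'(Z_{\lambda^0})\,\mathrm{cov}\{Y_{\lambda^0}, G(Y_{\lambda^0})\},$$

which expression is independent of the means of $Y_{\lambda^0}$ and $Z_{\lambda^0}$.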

(b). Let $G$ and $H$ be symmetric about their means, have first moments and
continuously differentiable densities which do not vanish at the means, or be limits of
such distributions and possess densities. For sequences $\{\lambda(n)\}$ of elements of
$\Lambda' - \{\lambda^0\}$ converging to $\lambda^0$ for which for each $\beta > \alpha' + \alpha''$ the numbers

$$\lim_{n \to \infty} \sqrt{n}\,\lambda_2(n) = c_2, \qquad \lim_{n \to \infty} \sqrt{n}\,\lambda_3(n) = c_3$$

exist and satisfy

$$c_2 G'(EY_{\lambda^0})\,E|Z_{\lambda^0} - EZ_{\lambda^0}| + c_3 H'(EZ_{\lambda^0})\,E|Y_{\lambda^0} - EY_{\lambda^0}| \ne 0,$$

we have

$$\mathcal{T}_3, \text{ applied as an } (\alpha', \alpha'')\text{-level test, is in } \mathcal{P}^{1}_{\alpha',\alpha''}(\tfrac12, \Lambda'),$$

$$A_3(c) = 2c_2 G'(EY_{\lambda^0})\,E|Z_{\lambda^0} - EZ_{\lambda^0}| + 2c_3 H'(EZ_{\lambda^0})\,E|Y_{\lambda^0} - EY_{\lambda^0}|,$$

which expression is independent of the means of $Y_{\lambda^0}$ and $Z_{\lambda^0}$.

² If the (continuous) $G$ and $H$ are defined as the pointwise limits of distributions $G^{(\epsilon)}$,
$H^{(\epsilon)}$ with continuously differentiable densities, interpret the functionals of $G$ and $H$ in the
text as limits with respect to $\epsilon$ of the corresponding functionals of $G^{(\epsilon)}$ and $H^{(\epsilon)}$.
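(c). Let $G$, $H$ possess fourth moments and let $\sigma(Z_{\lambda^0})/\sigma(Y_{\lambda^0}) = b$.
For sequences $\{\lambda(n)\}$ of elements of $\Lambda' - \{\lambda^0\}$ converging to $\lambda^0$ for which for each
$\beta > \alpha' + \alpha''$ the numbers

$$\lim_{n \to \infty} \sqrt{n}\,\lambda_2(n) = c_2, \qquad \lim_{n \to \infty} \sqrt{n}\,\lambda_3(n) = c_3$$

exist and satisfy

$$bc_2 + \frac{1}{b}c_3 \ne 0,$$

we have

$$\mathcal{R}, \text{ applied as an } (\alpha', \alpha'')\text{-level test, is in } \mathcal{P}^{1}_{\alpha',\alpha''}(\tfrac12, \Lambda'),$$

$$A_R(c) = bc_2 + \frac{1}{b}c_3,$$

with the latter expression depending on $G$ and $H$ only through $b$.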
II(a). Let G, H be as in I(a), and suppose that no sequence exists as there de- 
scribed. Let there exist, however, sequences {\(n)} of elements of A’ — {r°} converg- 
ing to »° for which there is a smallest integer p > 1 with 


= [| F, dF 
Ax, ey Ar, J ra C A> ar 


. 0 . 
continuous near \ = dX, with 


oP 


Or, J 


/ F, dFy(y, ©) dF,(, 2) 


1 


lim 


n»® 


existing for each B ” and with 


ap [ ‘ 
-* = —— FP, dFy(y, ~) dF,(@, z) 
Cky De, Are, | » ¢ y ary » ©) 
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different from zero. Then, 


= . ” ; ; ) 
31, applied as an (a’, a )-level test, is in Py?’ 4” ( 


Ai(ec) 


oP 


“ » OK, eee Ok, I| I , dk ly, 0 ) dk af 0,2). 


(b). Let G, H be as in I(b), and suppose that no sequence exists as there de- 
scribed. Let there exist sequences {d(n)} of elements of A’ — {dX} converging to 


0 - ‘ ; ‘ : 
dX for which there is a smallest integer p > 1 with 


oP f , r vr. , r 7 
x, Ton Ore, Fy, Ey »0 oad do EZ, As Ey »0 -b N4 EZ 0) 


° 0 
continuous near X , with 


lim n*?{X(n) — r°} 


n-2® 
existing for each B > a’ + a”, and with 


oP 


FiO, EY) o aa Ae EZ, Az EY 0 ad Na EZ,0) 


Cky 


Or, -** OY, 


different from zero. Then 


om . ; ” . . > 
33 applied as an (a’, a )-level test, is in Py" a ( ht) 
2p 


4 ap 
se “*P OD, , eee Om, 
‘FOL EYy0 + AL EZy0, Aw EYi0 + NY EZ,,). 


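(c) Against sequences $\{\lambda(n)\}$ of elements of $\Lambda' - \{\lambda^0\}$ converging to $\lambda^0$ for
which

$$\lim_{n \to \infty} \Big\{ -\frac{\lambda_1(n)/\lambda_4(n)}{\lambda_2(n)/\lambda_3(n)} \Big\} = \frac{\sigma^2(Z_{\lambda^0})}{\sigma^2(Y_{\lambda^0})},$$

$\mathcal{R}$ is not consistent.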







Proor. First we prove (a) and (b), letting G and H satisfy the conditions of 
Lemma 2.1. Let R(yo , 20.) = {(y, z):y S yo, 2 S 2}, and 


i ff ime =e) (2 =i) 
J = |de G . LE 2) ay de 
J = |det d If . ( Ls) H a dy 


zo) 


= [| G” (y)H'(z) dy dz, 
~“R)(yo.20) 
where (yo, 20) is the corresponding continuous transform of R(yo, 20). Since 
f° §’ G@’ (H(z) dy dz exists for all y, z, including @, there exist y” < y’, 2” < 2’ 
such that 


| G” (y)H' (z) dy dz 
R(yo.2”)UR(y’ 20) 


is arbitrarily small. Let R(y, z) = R(y, z) — Ry, 2”) u Ry”, z), Rly, z) = 
Ry(y, z) — Ry, 2”) u Ry”, z). Since, moreover, for > close to \° there ex- 
ist y’ > yo, 2’ = 2 close to yo and 29 such that 


my — 
ta(yo, 20.) C Ry’, z’), 


° > ° “fe . 0 “4° ° ° 
convergence of J at infinity is uniform in \ near ). Since the integrand is also 


ti 


continuous uniformly in \ over R(y’, 2’) — R(y”, 2”), the integral over 
Ry’, 2’) — Rayo, Zo) 


can be made arbitrarily small, and so also the absolute continuity of J is uni- 
form in \ near \°. For the other integrals arising in the partial derivatives of 
I,, Iz, and I;, we also get these uniform properties, so that the generalized 
Lebesgue convergence theorem is applicable, and the partial derivatives are 
continuous functions of \ near \” and can be obtained by differentiation under 
the integral sign. The continuity in \ of J,, J2, and J;, and of Pe7,(X,) and 
F;,(X,) in which occur expressions such as 


I Fily, ©)Fi(~, 2) | Fug, 2) dFi(y, ~) dF ly, 2), 


follows likewise. The continuity of E®},(x,) and E®},(2,) implies (utilizing the 
results quoted in Section 1.3) that 


2 =. a 0 
Tin(A) ~ { B@s) (X)) — Ti(A) }oin(A ) 
and 


2 vz ()2 7 27 5 2 0 
Oan(A) ee { Hb.) (X)) ™ 72(A)}oon(A ) 
are positive in the neighborhood of \ = ), so that the asymptotic distribution of 
(T in — 7;:)/oin i8 NOrmal with zero mean and unit variance in that neighborhood 
fori = 0, 1, 2. For 7 = 3, this follows at once from the easily verified fact that 
F\(u, v) is different from both 0 and 4 when d is near ar. 
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The approach to normality of 7';, is uniform. For z = 1, 2 (and so fori = 0) 
this follows from the method of proof of the approach to normality, which uses 
the central limit theorem for the identically distributed random variables 
Bi) (Xy a) — Ti(A)\(a = 1, ---, n), and the fact that the asymptotic distribu- 
tions of two random variables V. = n?(i + 1)d0ai1’(Xy.2) and 


Vi S/n {T in( Xx) — 7:fd)} 

are the same when the expectation of the squared difference converges to zero, 
by noting the above continuity property. In fact, E(V, — Vx)’ converges to 
zero uniformly and EV,’ is continuous in \. For i 3, the continuity in Xd of 
F, and its first partial and cross derivatives at (u, v) is also sufficient for the uni- 
form approach to normality. This is so because the asymptotic normality proof 
involves the Glivenko-Cantelli theorem, which holds uniformly when the dis- 
tribution functions are continuous uniformly with respect to the parameter 
(see [13]), and an argument analogous to that of the de Moivre-Laplace theorem 
for binomial random variables with constants equal to the first partial and 
cross derivatives of Fy at (u, v). 

It remains to consider continuous distribution functions G = G®, H = H® 
which are limits of distribution functions G“’, H“’, as mentioned in the theorem. 
Since by Pélya’s theorem the convergence is uniform, given any 7 > O there 
exists ¢) > 0 such that for « < « there is a sphere of F,“-measure bigger than 
1 — » with the property that on its complement C we have (if PF“ = F°? — F°’) 


[ dFy" (y, 2), 


“Ce 


[\FS Gy, 2 dFx” (y, z) 


[ fire, 2) |Fx° (0 , 2) dF” (y, 2) = | dFy" (y, 2), 
“Cc c 


which is less than n, giving the convergence of J;.(A) to I(A) for i = 1, 2 (de- 
fined in an obvious way) by the Helly-Bray theorem; we easily obtain it fori = 3 
as well. Since J;.(A) and J(A) are continuous in \, this convergence holds uni- 
formly in a neighborhood of \° by a slight extension of the reasoning in [7]. 
Similarly we get uniform convergence of EV;, and EV;2 to EVixy and EV33 
and of E(V). — Vx.)° to E(Vx0 — Vio)’, so that the limits are continuous in 
d near d°. Let ” coincide with d° except at the kth coordinate, where it equals 
\, . By the uniformity in \ of the convergence of /;.(A) as « — 0, we have, for 
\ near), 

lim 17 ;(A°)/@% = lim lim [J,(A“) — Tie(d°))/(e — a) 

«+0 €+0 Agra,” 

= lim lim [Ji(A) — Ti(0°))/Ox — de) 


0 
AgwAp €0 


lim [To() ne T o(X°)|/ Og _ rn) = AT o(d°) OAL . 


0 
hpode 
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Finally we have to show that the partial derivative J xo(A) is continuous near 
U . . . 
\ for \ approaching \° along any ray: Letting 

MQ =A +t’ —), ATO =A + tV’ — A”), 


AT p(X) /OX lim 07;.(')/dr lim lim lim A,(e) 


€70 t-+0 AgwdAy 


lim lim lim A,(e) = lim lim A,(0) 


t-+0 Nard €+0 t+0 Nerd; 


> 


lim OJ (ACO) /drx 


where 
[Tie(n(t)) — Tie(A(1)))/ On — Dk 


Now we prove (c). It is easy to see that if the fourth moments of $Y_{\lambda^0}$ and $Z_{\lambda^0}$
are finite, and if

$$-\frac{\lambda_1(n)/\lambda_4(n)}{\lambda_2(n)/\lambda_3(n)} \ne \frac{\sigma^2(Z_{\lambda^0})}{\sigma^2(Y_{\lambda^0})}$$

for all sufficiently large $n$, the parametric test $\mathcal{R}$, involving the ordinary correlation
coefficient $R_n$, can be compared with the above tests. It has (see last part
of Section 1.3)

$$A_R(c) = c_2 \frac{\sigma(Z_{\lambda^0})}{\sigma(Y_{\lambda^0})} + c_3 \frac{\sigma(Y_{\lambda^0})}{\sigma(Z_{\lambda^0})},$$

whatever be $G$ and $H$. In fact, one finds $k \sim 1$ near $\lambda = \lambda^0$, and on the subset
$S_\epsilon$ of the sample space on which the sample moments involved in the definition
of $R_n$ differ from the corresponding population moments by less than $\epsilon > 0$,
Tchebycheff inequalities on $P_\lambda(S_\epsilon)$ are satisfied uniformly when the fourth moments
of $Y_{\lambda^0}$ and $Z_{\lambda^0}$ are finite, so that (compare [2], p. 366) we can approximate
uniformly $n^{1/2}(R_n - \rho)$ by a linear function of the differences. Moreover, by the
continuity in $\lambda$ of $EY_\lambda Z_\lambda$, the distribution of these differences is seen to be
uniformly asymptotically normal. This concludes the proof of the theorem.

Since the efficiency of 3; depends so strongly on local properties of the density, 
we must conclude that this test is not to be recommended generally. 

In the case of most interest where, except for a change of scale, and/or origin, 
Y,o and Z,o have the same distribution, the condition on the sequences {A(n)} 
does not explicitly depend on the form of this distribution. It is therefore worth- 
while to particularize our result to that case. 

THEOREM 2.3. Let $\Lambda'$ denote a set of nonsingular linear transformations of
$Y_{\lambda^0}$ and $Z_{\lambda^0}$ which do not consist merely of a change of scale or permutation of
$Y_{\lambda^0}$ and $Z_{\lambda^0}$ or of the identity transformation $\lambda^0$. Let there exist $a$ and $b$ such that

$$P\{Z_{\lambda^0} < t\} = G\{(t - a)/b\}.$$







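I(a). Let $G$ have a first moment and a continuously differentiable density or be
the limit of a sequence of such distributions and possess a density.

For sequences $\{\lambda(n)\}$ of elements of $\Lambda' - \{\lambda^0\}$ converging to $\lambda^0$ for which for
each $\beta > \alpha' + \alpha''$ the numbers

$$\lim_{n \to \infty} \sqrt{n}\,\lambda_2(n) = c_2, \qquad \lim_{n \to \infty} \sqrt{n}\,\lambda_3(n) = c_3$$

exist and satisfy

$$bc_2 + \frac{1}{b}c_3 \ne 0,$$

we have, for $i = 0, 1, 2$,

$$\mathcal{T}_i, \text{ applied as an } (\alpha', \alpha'')\text{-level test, is in } \mathcal{P}^{1}_{\alpha',\alpha''}(\tfrac12, \Lambda'),$$

$$A_i(c) = 12\Big(bc_2 + \frac{1}{b}c_3\Big) EG'(Y_{\lambda^0})\,\mathrm{cov}\{Y_{\lambda^0}, G(Y_{\lambda^0})\},$$

which is independent of the mean of $Y_{\lambda^0}$.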


(b). Let $G$ be symmetric about its mean, have a first moment and a continuously
differentiable density which does not vanish at the mean, or be the limit of such
distributions and possess a density.

For sequences $\{\lambda(n)\}$ of elements of $\Lambda' - \{\lambda^0\}$ converging to $\lambda^0$ for which for each
$\beta > \alpha' + \alpha''$ the numbers

$$\lim_{n \to \infty} \sqrt{n}\,\lambda_2(n) = c_2, \qquad \lim_{n \to \infty} \sqrt{n}\,\lambda_3(n) = c_3$$

exist and satisfy³

$$bc_2 + \frac{1}{b}c_3 \ne 0,$$

we have

$$\mathcal{T}_3, \text{ applied as an } (\alpha', \alpha'')\text{-level test, is in } \mathcal{P}^{1}_{\alpha',\alpha''}(\tfrac12, \Lambda'),$$

$$A_3(c) = 2\Big(bc_2 + \frac{1}{b}c_3\Big) G'(EY_{\lambda^0})\,E|Y_{\lambda^0} - EY_{\lambda^0}|,$$

which is independent of the mean of $Y_{\lambda^0}$.

(c) Let $G$ possess a fourth moment and let $\sigma(Z_{\lambda^0})/\sigma(Y_{\lambda^0}) = b$.

For sequences $\{\lambda(n)\}$ of elements of $\Lambda' - \{\lambda^0\}$ converging to $\lambda^0$ for which for each
$\beta > \alpha' + \alpha''$ the numbers

$$\lim_{n \to \infty} \sqrt{n}\,\lambda_2(n) = c_2, \qquad \lim_{n \to \infty} \sqrt{n}\,\lambda_3(n) = c_3$$

exist and satisfy

$$bc_2 + \frac{1}{b}c_3 \ne 0,$$

we have

$$\mathcal{R}, \text{ applied as an } (\alpha', \alpha'')\text{-level test, is in } \mathcal{P}^{1}_{\alpha',\alpha''}(\tfrac12, \Lambda'),$$

$$A_R(c) = bc_2 + \frac{1}{b}c_3,$$

whatever be $G$.

³ If the (continuous) $G$ is defined as the pointwise limit of distributions $G^{(\epsilon)}$ with a
continuously differentiable density, interpret the functionals of $G$ in the text as limits with
respect to $\epsilon$ of the corresponding functionals of $G^{(\epsilon)}$.
II. For sequences $\{\lambda(n)\}$ for which

$$\lim_{n \to \infty} \Big\{ -\frac{\lambda_1(n)/\lambda_4(n)}{\lambda_2(n)/\lambda_3(n)} \Big\} = b^2,$$

such as occurs in sequences of rotations (discussed below), there may still exist $p > 1$
for which $A_i(c)$ is well defined for $i = 0, 1, 2$, or 3 following Theorem 2.2 II. (See
Lemmas 2.2, 2.3, and 2.4, below, which also deal with other cases where sequences
as in I do not exist.) For such sequences $\mathcal{R}$ is not consistent; and for $G$ possessing a
symmetric density and $b = 1$, the tests $\mathcal{T}_i$ ($i = 0, 1, 2, 3$) are not consistent.

Proor. We only need to prove the absence of consistency of the 3,;, and 
therefore the vanishing of the 7; , under the conditions mentioned. For this pur- 
pose the location of the center of symmetry is immaterial, and we shall suppose 
it and a to be zero. Let f be the joint density of Y, and Z, , and for simplicity 
suppose that 


Ai(n) = u(n) = A, Ao(n) = —As(n) = Ag 
exactly. Then, since G’(t) = G’(—t), 


f (Al + A3 to(t ~ mi) @ (ES) 
¥Y.-2a= / Ao) I eo 9 a 9 
- ae (x1 + 2)" * LOT + 29) 


(2 42) I (i = u#) Q (3 - i) f 
= (A, + Ago) 7.2 a, es 2 = im HF, 
per NOE + DY) Kai abt) ~- , 


which identity is easily seen to imply 7; = 0 for i = 0, 1, 2, 3. (In the case of 
rotations one can prove this result even for the case where no densities exist, 
using Theorem 4.1 of [9].) 

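Remark. It may be noted that under the assumptions made on $G$ in the preceding
theorem under I(b),

$$0 < \mathrm{cov}\{Y_{\lambda^0}, G(Y_{\lambda^0})\} < \tfrac12 E|Y_{\lambda^0} - EY_{\lambda^0}| < \tfrac12 \sigma(Y_{\lambda^0}),$$

while, of course,

$$\mathrm{cov}\{Y_{\lambda^0}, G(Y_{\lambda^0})\} \le \tfrac12 \sigma(Y_{\lambda^0})/\sqrt{3}$$

($\sigma(Y_{\lambda^0})$ need not be finite).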


EXAMPLES. The following are some numerical examples of the application of
I of Theorem 2.3. Here $bc_2 + (1/b)c_3 \ne 0$ and $A' = A(c)[bc_2 + (1/b)c_3]^{-1}$. Thus
the asymptotic power of any of these tests against $\lambda'$ (near $\lambda^0$) is

$$1 - (2\pi)^{-1/2} \int_{\delta' - A(\lambda' - \lambda^0)n^{1/2}}^{\delta'' - A(\lambda' - \lambda^0)n^{1/2}} e^{-t^2/2}\,dt,$$

with the $A$ corresponding to that test. The relative efficiency of two tests is
found by squaring the quotient of the entries in the corresponding columns:

    G            A'_0, A'_1, A'_2    A'_3     A'_R

    normal       3/π                 2/π      1
    uniform      1                   1/2      1
    parabolic    162/175             9/16     1
    Laplace      36/32               1        1
Laplace 36/32 1 


The above theorem depends on development in a Taylor series expansion up
to the first term of $I_i(\lambda)$ about $\lambda^0$ for $i = 1, 2, 3$. In case $G$ and $H$ are the
same distribution except for a scale or location factor, but $bc_2 + (1/b)c_3 = 0$,
this term vanishes, and we have to obtain second- or higher-order terms. This
is done in the following three lemmas, the calculations for which are interesting
but exceedingly laborious. They were carried out by Mr. Arnold Kaplan under
the author's general direction; his contribution is here gratefully acknowledged.
In the derivations, conditions allowing differentiations under the sign of integration
were freely assumed, and in Lemma 2.2 (2.3, 2.4) the existence and continuity
of the second (third, fourth) derivative of the density of $Y_{\lambda^0}$ and the existence
of its moments up to the same order was assumed (although this may not
be necessary).


Lemma 2.2. Suppose there exist numbers b > 0 and a such that 


H(t) 5 o(5>*), etek ad 


b b 


’ 1 
For i = 0, 1, 2, 3, let A; equal the coefficient of (bc. + b 


A,(c) in Theorem 2.31, and let 1/q = bee(cy — c,). With the notations and condi- 
tions of the previous lemma and further conditions on G implied by the remarks 
immediately above, we obtain, neglecting third-order terms, 


C3) in the expression for 


T,(A) — L(A’) = 2b Aa(Ae — 1) EG’(¥y0) cov{ Yao, G(¥y0)}, 
T(X) — In(d°) = b do(\y — AEG’ (Ya) cov{ Yyo, G(Yy0)} 
— Wrz[cov{ Yy0, G’(¥x0)}}, 
T3(X) — Is(d°) = Bbdo(Ay — A1)G’(EYy0)E| Yyo — EYyo |, 
p(A) — p(d°) = bdro(Ae — Ar), 


so that qA,(c), qAs(c), and qA,(c) are the same as Aj, A; and A, respectively, 
while qAo(c) = qAo(c) is generally different from A; = Ao if G is asymmetric. 
If the distribution of Yo is symmetric about its mean, the covariance Yo and 
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G’(Y,o) vanishes, and the development of J;(A) and of 2J2(A) coincide. If dg, 
\1, the first and third expressions vanish, and p(\) 0. Moreover, if the distribu- 
tion of Y,o is symmetric about its mean, the second expression vanishes as well. 
We therefore investigate third-order terms in 
LemMa 2.3. With the notations and conditions of the previous lemma, we obtain, 
neglecting fourth-order terms, when \, = Ya: 
5 
» 2, 27 +17 Wf, 1tr (=p ~ O anrvlre { 
hts) — OQ’) = —8HrAA1 — I) {BY .G'(¥x0)}< EY.0G’(¥x0) + " EG'(Yx0)>, 
| } 
I2(X) — I.(X) = BAS (Ay -_ 3)[cov{ Yyo ; G’( Yr ») } r 
T3(\) — Is(°) = 30°A30. — 1). 


If the distribution of Yo is symmetric about zero, the first two expressions 
vanish. Further development shows, however, that (with A, ~ 1) (A) — ,(\’) 
does not vanish identically in that case: 

Lemma 2.4. With the notations and conditions of the previous lemma, and sym- 
metry of the distribution of Y,o about zero, we have, neglecting fifth-order terms, 


I(x) — LQ’) = - , A3(A1 — 1) | {G” (y)}? dy{3b-EY xo + a} + BAIA, — 1)’, 
) 


where the first expression on the right-hand side vanishes for a = 0. 


3. Rotations. As a special case of linear transformations, we consider the class
of rotations. For $|\theta| < \pi/4$, write $F_\theta$ for the distribution obtained from $F_0$ by
a rotation

$$\lambda_1 = \lambda_4 = \cos\theta, \qquad \lambda_2 = -\lambda_3 = \sin\theta.$$
An immediate application of Theorem 2.2 yields: 


THEOREM 3.1. Let $G$, $H$ have first moments and continuously differentiable densities,
or be limits of such distributions and possess densities; and consider null sequences
$\{\theta(n)\}$ of nonzero angles of rotation for which for each $\beta > \alpha' + \alpha''$

$$\lim_{n \to \infty} \sqrt{n}\,\sin\theta(n) = c$$

exists and differs from 0.

(a). If²

$$EG'(Y_0)\,\mathrm{cov}\{Z_0, H(Z_0)\} \ne EH'(Z_0)\,\mathrm{cov}\{Y_0, G(Y_0)\},$$

then, for $i = 0, 1, 2$,

$$\mathcal{T}_i, \text{ applied as an } (\alpha', \alpha'')\text{-level test, is in } \mathcal{P}^{1}_{\alpha',\alpha''}(\tfrac12, \Theta),$$

$$A_i(c) = 12c\,[EG'(Y_0)\,\mathrm{cov}\{Z_0, H(Z_0)\} - EH'(Z_0)\,\mathrm{cov}\{Y_0, G(Y_0)\}].$$

(b). Let $G$ and $H$ also have nonvanishing densities at the means and be symmetric
about the means.

If²

$$G'(EY_0)\,E|Z_0 - EZ_0| \ne H'(EZ_0)\,E|Y_0 - EY_0|,$$

$$\mathcal{T}_3, \text{ applied as an } (\alpha', \alpha'')\text{-level test, is in } \mathcal{P}^{1}_{\alpha',\alpha''}(\tfrac12, \Theta),$$

$$A_3(c) = 2c\,[G'(EY_0)\,E|Z_0 - EZ_0| - H'(EZ_0)\,E|Y_0 - EY_0|].$$

(c). If $EY_0^4 < \infty$ and $EZ_0^4 < \infty$ and $\sigma(Z_0)/\sigma(Y_0) = b \ne 1$, but otherwise $G$
and $H$ are arbitrary,

$$\mathcal{R}, \text{ applied as an } (\alpha', \alpha'')\text{-level test, is in } \mathcal{P}^{1}_{\alpha',\alpha''}(\tfrac12, \Theta),$$

$$A_R(c) = c\Big(b - \frac{1}{b}\Big).$$

If (a) $EG'(Y_0)\,\mathrm{cov}\{Z_0, H(Z_0)\} = EH'(Z_0)\,\mathrm{cov}\{Y_0, G(Y_0)\}$
or (b) $G'(EY_0)\,E|Z_0 - EZ_0| = H'(EZ_0)\,E|Y_0 - EY_0|$

(due, for example, to $G = H$), one applies II of Theorems 2.2 or 2.3. In general
this proves to be laborious. For sufficiently smooth distribution functions one
may expect to find $p$ for which $A_i(c) \ne 0$, since, generally, $F_\theta$ depends on $\theta$.
However, the remark below Lemma 2.2 implies that $\mathcal{R}$ is not consistent when
$b = 1$.
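The role of $b$ in part (c) can be seen from the exact correlation of the rotated pair. For independent $Y_0$, $Z_0$ and the rotation $Y_\theta = \cos\theta\,Y_0 + \sin\theta\,Z_0$, $Z_\theta = -\sin\theta\,Y_0 + \cos\theta\,Z_0$, the covariance is $\sin\theta\cos\theta\,\{\sigma^2(Z_0) - \sigma^2(Y_0)\}$. The sketch below, with an illustrative function name, evaluates the correlation and can be used to confirm that its slope in $\sin\theta$ at $\theta = 0$ is $b - 1/b$ and that it vanishes identically when $b = 1$.

```python
import math


def rotated_correlation(theta, sigma_y, sigma_z):
    """Correlation of the rotated pair when Y0 and Z0 are independent."""
    c, s = math.cos(theta), math.sin(theta)
    cov = c * s * (sigma_z ** 2 - sigma_y ** 2)
    var_y = c * c * sigma_y ** 2 + s * s * sigma_z ** 2
    var_z = s * s * sigma_y ** 2 + c * c * sigma_z ** 2
    return cov / math.sqrt(var_y * var_z)
```

With equal standard deviations the returned value is exactly zero for every angle, which is the reason a test based on the correlation coefficient cannot detect such rotations.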

Further properties of the rotations have been studied in [9]. 
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THE EFFICIENCY OF SOME NONPARAMETRIC COMPETITORS OF 
THE t-TEST 


By J. L. Hodges, Jr., and E. L. Lehmann¹
University of California, Berkeley 

0. Summary. Consider samples from continuous distributions F(x) and 
F(x — 0). We may test the hypothesis @ = 0 by using the two-sample Wilcoxon 
test. We show in Section 1 that its asymptotic Pitman efficiency, relative to the 
t-test, never falls below 0.864. This result also holds for the Kruskal-Wallis test 
compared with the F-test, and for testing the location parameter of a single 
symmetric distribution. 

A number of alternative notions of asymptotic efficiency are compared in 
Section 2. In this connection, certain difficulties arise because power is not 
necessarily a convex function of sample size. As an alternative to the Pitman 
notion of asymptotic efficiency, we consider in Section 3 one based on the speed 
with which power at a fixed alternative tends to 1. In particular we obtain, for 
the sign test relative to the t in normal populations, the limit as $n \to \infty$ of the
sequence of power efficiency functions. It is noted that certain interchanges of 
limit passages are not always possible. 


1. Minimum Pitman efficiency of the Wilcoxon and sign tests. For comparing 
the large sample power of two sequences of tests, the concept of asymptotic 
relative efficiency was developed by Pitman [1]. An exposition of his work, to- 
gether with some extensions, was recently given in [2] and [3]. Applications to a 
number of specific problems are made in [4] and [5]. 

Let $\beta_N(\theta)$ and $\beta_N^*(\theta)$ denote the power functions of two tests, say A and A*,
based on the same set of N observations, against a parametric family of
alternatives labeled by $\theta$, and let $\theta_0$ be the value of $\theta$ specified by the
hypothesis. We shall assume that all tests are at level of significance $\alpha$. Let
$\beta$ be a specified power with $\alpha < \beta < 1$. Consider a sequence of alternatives
$\theta_N$ such that


(1.1)    $\beta_N(\theta_N) \to \beta$, as $N \to \infty$,


and a sequence N* = h(N) such that


(1.2)    $\beta^*_{N^*}(\theta_N) \to \beta$.
Then if


(1.3)    $e_{A^*,A} = \lim_{N \to \infty} N/N^*$

exists, and is independent of $\alpha$, $\beta$ and the particular sequences $\{\theta_N\}$ and $\{h(N)\}$
chosen, then $e_{A^*,A}$ is defined to be the relative asymptotic efficiency of the test A*
with respect to the test A. Under weak assumptions (1.1) implies that $\theta_N \to \theta_0$,
and in the most common cases it turns out that $\theta_N$ tends to $\theta_0$ at the rate $N^{-1/2}$.
Usually the N observations constitute a sample, or are divided into two samples
of sizes m and n with m + n = N. In the latter case we assume that m/n tends
to some limit $\rho$, $0 < \rho < \infty$, as N tends to $\infty$. In many problems, including
those we study, $e_{A^*,A}$ is independent of $\rho$.

Pitman gave a method for obtaining the limit (1.3), and evaluated it for a
number of problems. Consider in particular the case of samples $X_1, \ldots, X_m$
and $Y_1, \ldots, Y_n$ from continuous distributions F and G and the hypothesis
H: F = G. We shall be concerned with the narrower alternatives that G differs
from F only by a shift, so that $G(u) = F(u - \theta)$ for all u. The discussion applies
to both the one-sided case $\theta > \theta_0 = 0$ and the two-sided case $\theta \ne \theta_0 = 0$. If F
is a normal distribution, the appropriate test is Student's t-test. A nonparametric
test proposed by Wilcoxon is based on the rank sum of the Y's among the set of
N ordered observations. Pitman computed the asymptotic efficiency of
the Wilcoxon test relative to the t-test as


(1.4)    $e_{W,t} = 12\sigma^2 \left[\int f^2(x)\,dx\right]^2$,


where f is the probability density of the distribution F, and $\sigma^2$ is the common
variance of the X's and Y's. Some particular values given by Pitman are
$e_{W,t} = 3/\pi \approx .95$ when f is a normal density, $e_{W,t} = 1$ for the case of a uniform
distribution, and $e_{W,t} = 81/64$ when $f(x) = x^2 e^{-x}/\Gamma(3)$ for $x \ge 0$. All of these values are
surprisingly high and raise the question as to how low e can actually drop. We
shall prove, below, the following theorem.
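The three values quoted from Pitman can be checked numerically from (1.4). The following is a minimal sketch, not part of the original paper: the helper names `simpson` and `wilcoxon_vs_t`, the integration ranges, and the step count are assumptions chosen for illustration.

```python
import math

def simpson(func, lo, hi, steps=20000):
    """Composite Simpson rule on [lo, hi]; steps must be even."""
    h = (hi - lo) / steps
    total = func(lo) + func(hi)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * func(lo + k * h)
    return total * h / 3

def wilcoxon_vs_t(density, variance, lo, hi):
    """Evaluate (1.4): 12 * sigma^2 * (integral of f^2)^2."""
    integral_f2 = simpson(lambda x: density(x) ** 2, lo, hi)
    return 12 * variance * integral_f2 ** 2

def normal(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2 * math.pi)

def uniform(x):
    return 1.0                              # uniform on [0, 1], variance 1/12

def gamma3(x):
    return x * x * math.exp(-x) / 2.0       # Gamma(3) = 2, variance 3

# Integration ranges are truncated where the densities are negligible.
e_normal = wilcoxon_vs_t(normal, 1.0, -12.0, 12.0)
e_uniform = wilcoxon_vs_t(uniform, 1.0 / 12.0, 0.0, 1.0)
e_gamma = wilcoxon_vs_t(gamma3, 3.0, 0.0, 60.0)
print(e_normal, e_uniform, e_gamma)
```

The printed values can be compared with $3/\pi$, 1, and 81/64 respectively.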

THEOREM 1. Let N* satisfy (1.2) where the tests A and A* are the (two-sample)
t-test and Wilcoxon test, respectively, for testing against shift of a continuous
distribution F. Then (a)

(1.5)    $\liminf_{N \to \infty} N/N^* \ge 108/125 = 0.864$

whatever F may be.

Furthermore, (b) the lower limit is attained for the distribution with density
(1.12), (1.13), for which e = .864.

PROOF. It was shown by Andrews [5] that if F is continuous, and


(1.6)    $\lim_{\theta \to 0} \frac{1}{\theta} \int [F(x + \theta) - F(x)]\,dF(x) = c$,


then the efficiency, given by (1.3), exists and is $12c^2\sigma^2$. This proof also shows that
quite generally


(1.7)    $\liminf_{N \to \infty} \frac{N}{N^*} \ge 12\sigma^2 \left[\liminf_{\theta \to 0} \frac{1}{\theta} \int [F(x + \theta) - F(x)]\,dF(x)\right]^2$.


By Fatou's lemma, the right-hand side is greater than or equal to


(1.8)    $12\sigma^2 \left[\int \liminf_{\theta \to 0} \frac{F(x + \theta) - F(x)}{\theta}\,dF(x)\right]^2$.

It follows further, from the Decomposition Theorem of De La Vallée Poussin 
(see [6], p. 127), that when F has a singular component, 

$\lim_{\theta \to 0} [F(x + \theta) - F(x)]/\theta = \infty$
on a set of positive F-measure, so that (1.8), and hence e, is infinite. We may
therefore assume that except on a set of F-measure zero, the density
$f(x) = F'(x)$ exists, in which case (1.8) becomes


(1.9)    $12\sigma^2 \left[\int f^2(x)\,dx\right]^2$.


If $\sigma = \infty$, then it follows from (1.6) that $e = \infty$, so that we may assume
$\sigma$ to be finite. Since (1.9) is invariant under a change of location or scale, we may
take $\sigma = 1$, and the problem of minimizing (1.9) then reduces to that of
minimizing


(1.10)    $\int f^2(x)\,dx$


subject to the conditions


(1.11)    $\int x f(x)\,dx = 0$;    $\int f(x)\,dx = \int x^2 f(x)\,dx = 1$;    $f(x) \ge 0$ for all x.


According to the method of undetermined multipliers, it is sufficient to minimize

$\int [f^2(x) + 2b(x^2 - a^2) f(x)]\,dx$.


For nonnegative f, this is achieved by setting


(1.12)    $f(x) = b(a^2 - x^2)$,    if $x^2 \le a^2$,


and f(x) = 0 otherwise. The constants a and b are determined from (1.11) to be

(1.13)    $a = \sqrt{5}$,    $b = \frac{3}{100}\sqrt{5}$,


and with these values, (1.9) becomes equal to 108/125, which is therefore a lower 
bound to (1.7). Since for the density (1.12) the limit of (1.6) may be taken under 
the integral sign, it is seen that the efficiency exists in this case, and equals the 
lower bound, which therefore cannot be improved. 
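The constants in (1.13) and the value 108/125 can be verified by direct integration. The following worked computation is a sketch that assumes only the conditions (1.11), the form (1.12), and $\sigma = 1$; the intermediate integrals are not displayed in the original.

```latex
\begin{align*}
\int_{-a}^{a} b(a^2 - x^2)\,dx &= \tfrac{4}{3} a^3 b = 1
  \quad\Longrightarrow\quad b = \frac{3}{4a^3}, \\
\int_{-a}^{a} x^2\, b(a^2 - x^2)\,dx &= \tfrac{4}{15} a^5 b = \frac{a^2}{5} = 1
  \quad\Longrightarrow\quad a = \sqrt{5},\quad
  b = \frac{3}{20\sqrt{5}} = \frac{3\sqrt{5}}{100}, \\
\int_{-a}^{a} b^2 (a^2 - x^2)^2\,dx &= \tfrac{16}{15} a^5 b^2 = \frac{3}{5a} = \frac{3}{5\sqrt{5}}, \\
12\,\sigma^2 \Bigl[\int f^2(x)\,dx\Bigr]^2 &= 12 \cdot \frac{9}{125} = \frac{108}{125} = 0.864 .
\end{align*}
```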

To the extent that the above concept of efficiency adequately represents 
what happens for the sample sizes and alternatives arising in practice, this result 
shows that use of the Wilcoxon test instead of the Student’s t-test can never entail 
a serious loss of efficiency for testing against shift. (On the other hand, it is 
obvious from (1.4) that the Wilcoxon test may be infinitely more efficient than 
the t-test.)

It should be mentioned that there are rank tests: that of Fisher and Yates,
which has been discussed by Hoeffding [7], Terry [8], and Dwass [9]; and that
of van der Waerden [10], for which the asymptotic efficiency relative to the t-test
is 1 when F is normal, and is conjectured to be >1 when F is not normal. Should
this be correct, then for these tests the lower bound .864 in (1.5) would be
replaced by the even better value 1.

The conclusion of Theorem 1 also applies to the H-test of Kruskal and Wallis 
[11] for testing equality of k distributions $F_1, \ldots, F_k$, which are assumed to
differ only in location. This follows from the fact that Andrews’ work, quoted 
above, was carried out for this more general problem, and that in particular 
formulae (1.6) and (1.7) hold for all values of k. 

Another application is to the case of a single sample $X_1, \ldots, X_N$ from a
distribution $F(x - \theta)$, where F is symmetric about 0. The hypothesis to be tested
is $H: \theta = 0$, and if F is known to be normal, the one-sample t-test is appropriate.
The Wilcoxon test for this problem is based on the rank sum of the positive X's
among the ordered absolute values $|X_1|, \ldots, |X_N|$. Pitman showed that (1.4)
also applies in this case, and the considerations of Andrews can be used to 
generalize this again to (1.6) and (1.7). 

A particularly simple test of the hypothesis $H: \theta = 0$ in the one-sample problem
is the sign test, based on the number of positive observations. For the asymptotic
efficiency of the sign test, relative to the t-test, Pitman obtained the result


(1.14)    $e_{S,t} = 4\sigma^2 f^2(0)$,


which is valid whenever the derivative $F'(0) = f(0)$ of F at the origin exists. A
particular value given by Pitman is $e = 2/\pi$ in case of a normal distribution. In
the present case there is, of course, no positive lower bound, since e = 0 when
f(0) = 0. If the distribution F is assumed to possess a unimodal density (in the
weak sense that $0 \le |x| < |x'|$ implies $f(x') \le f(x)$), then it is easily seen that
$e \ge 1/3$, the value 1/3 being attained for the case of a rectangular distribution. For
let f(0) = 1 without loss of generality, since (1.14) is invariant under a change
of scale. Then we must minimize


$\int (x^2 - a^2) f(x)\,dx$


subject to $0 \le f(x) \le 1$, and this is achieved by putting f(x) = 1 when |x| < a
and f(x) = 0 otherwise.
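The two values quoted for (1.14) follow from one-line substitutions. The worked check below is a sketch; taking $\sigma = 1$ for the normal case and the interval $[-a, a]$ for the rectangular case are conventions assumed here for illustration.

```latex
\begin{align*}
\text{normal, } \sigma = 1:\quad
  & f(0) = \frac{1}{\sqrt{2\pi}},
  & 4\sigma^2 f^2(0) &= \frac{4}{2\pi} = \frac{2}{\pi} \approx 0.637, \\
\text{rectangular on } [-a, a]:\quad
  & f(0) = \frac{1}{2a},\ \ \sigma^2 = \frac{a^2}{3},
  & 4\sigma^2 f^2(0) &= 4 \cdot \frac{a^2}{3} \cdot \frac{1}{4a^2} = \frac{1}{3}.
\end{align*}
```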

It may be questioned whether the high efficiency of Wilcoxon relative to t
established by Theorem 1 is the result of the particular alternatives considered.
It is therefore of interest to make the comparison for other than shift
alternatives. We shall now consider what may be called mixture or contamination
alternatives. In the two-sample problem this takes the form


$X_1, \ldots, X_m : F$,
$Y_1, \ldots, Y_n : (1 - \theta)F + \theta G$.

In the one-sample problem, the form is

$Z_1, \ldots, Z_n : (1 - \theta)F + \theta G$,    $F(x) + F(-x) = 1$.


In both cases we take $G \le F$ and test the hypothesis $\theta = 0$.

Mixture alternatives may be reasonable in many situations. For example, a
treatment may be effective in only a proportion $\theta$ of the population of subjects.
Thus, cancer operations are effective only if metastasis has not occurred; vitamin
therapy is useful only if there is a vitamin deficiency.

If we let (m and) n tend to infinity (at the same rate), while $\theta$ tends to 0, with
F and G fixed, we can compute the limiting efficiency of the Wilcoxon test relative
to the t-test (or, equivalently, to the test based on $\bar{Y} - \bar{X}$) from Pitman's
formula


$e = \lim_{N \to \infty} \left[\frac{\mu_N^{*\prime}(\theta_0)/\sigma_N^*(\theta_0)}{\mu_N'(\theta_0)/\sigma_N(\theta_0)}\right]^2$,


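where $T_N$ and $T_N^*$ are the statistics on which the tests are based and it is assumed
that


$\frac{T_N - \mu_N(\theta)}{\sigma_N(\theta)}$,    $\frac{T_N^* - \mu_N^*(\theta)}{\sigma_N^*(\theta)}$


have the limiting distribution N(0, 1). If $T_N = \bar{Y} - \bar{X}$ and $mnT_N^*$ is the Mann-
Whitney form of the Wilcoxon statistic, one obtains


$\sigma_N^{*2}(0) = (1/12)(1/m + 1/n)$,
$\sigma_N^{2}(0) = \sigma^2(1/m + 1/n)$,


$\mu_N^*(\theta) = P\{X \le Y\} = (1 - \theta) \int F\,dF + \theta \int F\,dG$,


$\mu_N(\theta) = E(Y) - E(X) = \theta \left[\int x\,dG - \int x\,dF\right]$.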
It follows that


(1.15)    $e = 12\sigma^2 \left[\frac{\int (F - G)\,dF}{\int x\,d(G - F)}\right]^2 = 12\sigma^2 \left[\frac{\int (F - G)\,dF}{\int (F - G)\,dx}\right]^2$,

where, as before, $\sigma^2$ is the variance of an observation from F. The equality of
the denominators in (1.15) follows by viewing each as an expression for the
(signed) area between F and G. The above computation can be made rigorous
by the methods of [5].

It is clear that $\int (F - G)\,dF \le \tfrac12$, while $\int (F - G)\,dx$ may be arbitrarily large
if G is far to the right of F. Thus (1.15) has no positive lower bound, which
corresponds to our intuition that the Wilcoxon test, like any rank test, is insensitive
to the size of large deviations.

If we particularize to $G(x) = F(x - \Delta)$, $\Delta$ fixed, we obtain


(1.16)    $12\sigma^2 \left[\int \frac{F(x) - F(x - \Delta)}{\Delta}\,dF(x)\right]^2$.


As $\Delta \to 0$, (1.16) agrees (under suitable regularity conditions) with (1.4), so
there is no finite upper bound to (1.15). We observe that


$\int [F(x) - F(x - \Delta)]\,dF(x) = \tfrac12 \Pr\{|X_1 - X_2| \le \Delta\}$,


and that $\Pr\{|X_1 - X_2| \le \Delta\}/\Delta$ will be a decreasing function of $\Delta$ whenever
$X_1 - X_2$ is unimodal. Thus in particular, if X itself is unimodal, the efficiency
decreases as $\Delta$ increases [12], so that the performance of Wilcoxon relative to
t is often less good against contamination with a shift than against shift itself.
For example, if F is normal and $\Delta = \sigma$, (1.16) has the value 0.812; for $\Delta = 2\sigma$
it is 0.533; and it tends to 0 as $\Delta \to \infty$.
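The normal-theory values of (1.16) can be reproduced with a few lines of code. This is a minimal sketch rather than the authors' computation: it assumes F is normal, so that the difference of two independent observations is normal with variance $2\sigma^2$ and the integral in (1.16) reduces to $\Phi(\Delta/(\sigma\sqrt{2})) - \tfrac12$. The function names are hypothetical.

```python
import math

def std_normal_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def contamination_shift_efficiency(d):
    """Value of (1.16) for normal F with Delta = d * sigma (sigma cancels)."""
    # integral [F(x) - F(x - Delta)] dF(x) = Phi(d / sqrt(2)) - 1/2 for normal F
    integral = std_normal_cdf(d / math.sqrt(2.0)) - 0.5
    return 12.0 * (integral / d) ** 2

for d in (0.001, 1.0, 2.0, 10.0):
    print(d, round(contamination_shift_efficiency(d), 3))
```

For very small d the result should approach the shift value $3/\pi$ of (1.4), and for large d it should decay toward 0, in line with the text.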

We remark that the greater sensitivity of the t-test to contamination is not an
unmixed blessing, as the contamination may, in some cases, represent gross 
errors of observation rather than the true effect of the treatment. In fact, in- 
sensitivity to large deviations is one of the advantages of nonparametric tests. 


2. Alternative notions of asymptotic efficiency. The result obtained above
suggests that if Pitman efficiency is taken as a guide, one may prefer the
Wilcoxon test to the t-test in almost all problems of testing against shift. But how
reliable is Pitman efficiency? Dixon [13], [14] has emphasized that a
comprehensive efficiency comparison of two tests cannot be made with a single number.
Suppose that a test A of level $\alpha$ and using N observations has power $\beta_A(N, \alpha, \theta)$
against alternative $\theta$. If test A*, also of level $\alpha$, requires N* observations to
produce the same power at the same alternative, we define the efficiency of A*
relative to A in these circumstances to be the ratio N/N*, and denote it by
$e_{A^*,A}(N, \alpha, \theta)$. The complete comparison of A* with A would require the
evaluation of this "power efficiency function" for all values of its three arguments.

We note that the definition of N* just given is not quite complete. There
usually will not exist an integer N* such that $\beta_{A^*}(N^*, \alpha, \theta) = \beta_A(N, \alpha, \theta)$, but
rather an $N_0$ such that $\beta_{A^*}(N_0, \alpha, \theta) < \beta_A(N, \alpha, \theta) < \beta_{A^*}(N_0 + 1, \alpha, \theta)$. Dixon
suggests that N* be defined by inverse interpolation of $\beta_{A^*}(N^*, \alpha, \theta)$ as a function
of N*; specifically, he proposes polynomial interpolation of N* against
$\Phi^{-1}(\beta_{A^*}(N^*, \alpha, \theta))$ in [13], [14]. We feel that this method, while yielding "smooth"
results, lacks any operational or functional meaning. Instead, we prefer to define
N* to be $N_0 + p$, where the test A* has power $\beta_A(N, \alpha, \theta)$ if its number of
observations is randomly chosen with probability p of being $N_0 + 1$ and probability
$1 - p$ of being $N_0$. Thus, our N* is the expected number of observations
required with test A* to match the power of test A, when randomizing between
consecutive integers. (Our definition implies linear interpolation.)
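The randomized definition of N* amounts to linear interpolation of the power curve between consecutive sample sizes. The following is a minimal sketch under the assumption that the power of A* is increasing in the sample size; the function name `randomized_sample_size` and the toy power curve are hypothetical and serve only to exercise the definition.

```python
def randomized_sample_size(power_a_star, target_power, max_n=10**6):
    """Return N* = N0 + p for a test with power power_a_star(n) at n observations.

    N0 is the largest n with power_a_star(n) <= target_power (power is assumed
    increasing in n), and p is the probability of taking N0 + 1 observations so
    that (1 - p) * power(N0) + p * power(N0 + 1) equals target_power.
    """
    n0 = 0
    while power_a_star(n0 + 1) <= target_power:
        n0 += 1
        if n0 >= max_n:
            raise ValueError("target power not reached")
    low, high = power_a_star(n0), power_a_star(n0 + 1)
    p = (target_power - low) / (high - low)
    return n0 + p

def toy_power(n):
    """Hypothetical power curve, used only for illustration."""
    return 1.0 - 0.5 ** n

n_star = randomized_sample_size(toy_power, 0.8)
print(n_star)        # expected number of observations matching power 0.8
print(4 / n_star)    # efficiency N / N* if the competing test used N = 4
```

With this toy curve the target 0.8 lies between the powers at 2 and 3 observations, so N* falls strictly between 2 and 3.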

We note in this connection a curious fact. For some tests, and specifically for
the t-test against normal shift, $\beta(N, \alpha, \theta)$ is not always convex as a function of N.
Thus, if we wish to attain a stated power for stated $\alpha$ and $\theta$ with smallest expected
number of observations, we would not randomize between consecutive integers!
However, as our main objective is to define an N* which gives the desired power,
and the randomization is introduced only out of necessity, we shall use the
definition given.

It might be felt that the question of the definition of N* is too trivial to require
so much discussion, and indeed if N is large this is so. But efficiency comparisons
are often made for small N and here (especially with $\delta$ large) the precise
definition of N* becomes important. To illustrate the point we present, below, the
efficiency figures given by Dixon [14] for Wilcoxon against t for normal shift of
amount $\delta$, equal samples of 5, $\alpha = 4/126$, and the corresponding values as computed
by our definition. (We are not able to obtain a worthwhile figure for $\delta = 4$, since
the value of $\beta_W$ is not given by Dixon to enough decimal places.) It is seen that
Dixon's conclusion that the "power efficiency decreases slightly for more distant
alternatives" is dependent on his method of interpolation for N*. With our
definition, the efficiency rises as $\delta$ is increased beyond about 3, and appears never
to fall below about 0.96, while the efficiency as computed by Dixon reaches .91
at $\delta = 4.5$ and seems still to be dropping. Similar results hold for the sign test
as discussed in Section 3.


$\delta$                  .5     1.5    2      2.5    3      3.5    4      4.5
$\beta_W$                .072   .431   .674   .858   .953   .988   .998   .9996
e (Dixon's paper)    …      .96    .95    .94    .94    .93    .92    .91
e (this paper)       .968   .978   .961   .956   .960   .960   .964   .976 ± .01

Depending as it does on three arguments, the function $e_{A^*,A}$ is difficult of
complete evaluation, and interest has centered on finding simpler quantities
which will serve to represent its general behavior. It is obvious that the Pitman
efficiency denoted above by $e_{A^*,A}$ is $\lim_{N \to \infty} e_{A^*,A}(N, \alpha, \theta_N)$, where $\theta_N$ satisfies
(1.1).

A second kind of efficiency limit is considered by Dixon [13], who evaluates
for the sign test compared with the t-test the limit $e_{S,t}(N, \alpha, \infty)$ (which he denotes
by $E_\infty$). This limit would be of interest if we were concerned with small N,
moderate $\alpha$, but $\beta$ very near to 1. (He also obtains $e_{S,t}(N, \alpha, 0)$ and finds that
$\lim_{N \to \infty} e_{S,t}(N, \alpha, 0)$ is, for his problem, equal to the Pitman efficiency.)

It is clear that a wide choice of limiting values of $e_{A^*,A}(N, \alpha, \theta)$ might be defined,
many of them pertinent in one situation or another. We wish next to call
attention to one possibility which is in a sense intermediate between those of Pitman
and Dixon and which seems to help to round out some comparisons. Instead of
letting $\theta \to 0$ as does Pitman, or $\theta \to \infty$ as does Dixon, we hold $\theta$ as well as $\alpha$
fixed and let $N \to \infty$. This limit we denote by $e_{A^*,A}(\infty, \alpha, \theta)$. It is closely related
to the "index" of Chernoff [15], differing mainly in that Chernoff requires that
$\alpha \to 0$, so that $\alpha$ and $1 - \beta$ remain of the same order. Our limit is presumably
pertinent when one is interested in large samples and the region of high power,
but its main interest seems to reside in the fact that it can, in some cases, be
computed and serves to give the limit as $N \to \infty$ of sequences of efficiency curves
of the form computed by Dixon for small N, permitting interpolation for
moderate N.


3. Limiting efficiencies for the sign test. All of the tests we shall now consider
(sign, normal, t) arise in both one-sided and two-sided versions. However, it is
true for all of them that as the power tends to 1, the probability of type II error
for the one-sided test of level $\alpha$ is asymptotically equivalent to that for the
corresponding two-sided test of level $2\alpha$. The reason for this is simply that the two
tests have identical critical values, and that one of the two tails in the two-sided 
test is dominant. This consideration simplifies the efficiency comparisons made 
below. 

We are interested in the limiting behavior as $N \to \infty$ of the probability of
second-kind error of the sign test. Suppose X is binomial for N trials with success
probability p. We may test H: p = r against the alternatives p < r. The test
accepts if $X \ge a_N$, where $a_N = rN - c\sqrt{N} + d_N$. That $d_N$ is bounded follows
easily from the fact that the error of the normal approximation to the binomial
is of order $1/\sqrt{N}$ (see, for example, [16], p. 129). (Using this critical value, the
level of significance tends to $\Phi(-c)$.) The probability of second-kind error is
then


$P_{II} = \sum_{x \ge a_N} \pi(x)$,


where $\pi(x) = \binom{N}{x} p^x (1 - p)^{N - x}$.

We can study the behavior of $P_{II}$ by separately considering the initial term
$\pi(a_N)$, and the ratio of the sum to this initial term.

LEMMA 3.1. If $N \to \infty$ and $a/N \to r > p$, then

$\sum_{x \ge a} \pi(x)/\pi(a) \to r(1 - p)/(r - p)$.

PROOF. Since $R(x) = \pi(x + 1)/\pi(x)$ is strictly decreasing,
$[R(a)]^c > \pi(a + c)/\pi(a) > [R(a + b - 1)]^c$, where $0 < c \le b$. Summing for
$0 \le c \le b$ we get


(3.1)    $\frac{1 - [R(a)]^{b+1}}{1 - R(a)} \ge \sum_{x=a}^{a+b} \frac{\pi(x)}{\pi(a)} \ge \frac{1 - [R(a + b - 1)]^{b+1}}{1 - R(a + b - 1)}$.


As $N \to \infty$, $b \to \infty$, and $a/N \to r$, we have $R(a) \to \frac{1 - r}{r} \cdot \frac{p}{1 - p} < 1$,
so that the upper bound in (3.1) tends to $r(1 - p)/(r - p)$. If, in addition, we
require $b/N \to 0$, the lower bound has the same limit. Since R(x) is decreasing
for x > a, we have $\sum_{x > a+b} \pi(x)/\pi(a) \to 0$, from which the result follows.

LEMMA 3.2. If $a_N = rN - c\sqrt{N} + d_N$ with $d_N$ bounded, then


$\pi(a_N) \sim \frac{\exp[-c^2/2r(1 - r)]}{\sqrt{N}\,\sqrt{2\pi}\,\sqrt{r(1 - r)}} \left[\left(\frac{p}{r}\right)^{r} \left(\frac{1 - p}{1 - r}\right)^{1 - r}\right]^{N} \left[\frac{r(1 - p)}{p(1 - r)}\right]^{c\sqrt{N} - d_N}$

as $N \to \infty$.

The proof consists in using Stirling's formula and simplifying.

Combining Lemmas 3.1 and 3.2, we see that


(3.2)    $\sqrt[N]{P_{II}} \to \left(\frac{p}{r}\right)^{r} \left(\frac{1 - p}{1 - r}\right)^{1 - r}$    as $N \to \infty$.


Note that the limit depends on the hypothesis r and alternative p, but not on
$\alpha$. Since it happens in each of the three problems dealt with in this section that
$\sqrt[N]{P_{II}}$ tends to a positive limit as $N \to \infty$, we shall give to this limit a name,
referring to it as the base of $P_{II}$. The base is essentially the quantity $\rho$ discussed
by Chernoff [15]. In fact, the limit in (3.2) is the value obtained in [15] for $\rho$ in
the binomial case. However, Chernoff's $\rho$ involves $\alpha \to 0$, whereas our $\alpha$ is fixed,
and as he considers a much more general problem his results are less sharp. A
similar remark applies to the normal test, below.
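The limits in Lemma 3.1 and in (3.2) can be observed numerically. The sketch below is an illustration under stated assumptions, not part of the paper: the values r = 0.5, p = 0.3, c = 1.645 are arbitrary, the critical value is taken as the smallest integer not below $rN - c\sqrt{N}$ (so that $d_N$ lies in [0, 1)), and the binomial terms are handled in logarithms to avoid underflow. Convergence of the N-th root is slow, of order $1/\sqrt{N}$.

```python
import math

def log_pmf(x, n, p):
    """Log of the binomial probability pi(x) for n trials, success probability p."""
    return (math.lgamma(n + 1) - math.lgamma(x + 1) - math.lgamma(n - x + 1)
            + x * math.log(p) + (n - x) * math.log1p(-p))

def tail_summary(n, r, p, c):
    """Return (sum_{x >= a} pi(x) / pi(a), N-th root of P_II), a = ceil(rN - c sqrt N)."""
    a = math.ceil(r * n - c * math.sqrt(n))
    log_first = log_pmf(a, n, p)
    ratio = 0.0
    for x in range(a, n + 1):
        term = math.exp(log_pmf(x, n, p) - log_first)
        ratio += term
        if term < 1e-17:          # remaining terms are negligible
            break
    nth_root = math.exp((log_first + math.log(ratio)) / n)
    return ratio, nth_root

def base(r, p):
    """Right-hand side of (3.2)."""
    return (p / r) ** r * ((1 - p) / (1 - r)) ** (1 - r)

r, p, c = 0.5, 0.3, 1.645         # illustrative values, not taken from the paper
for n in (10**3, 10**4, 10**5, 10**6):
    ratio, nth_root = tail_summary(n, r, p, c)
    print(n, round(ratio, 4), round(nth_root, 5))
print("Lemma 3.1 limit:", r * (1 - p) / (r - p), "base:", round(base(r, p), 5))
```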

We shall also need the bases for the normal and t-tests. Consider the problem
of testing that the mean of a normal population of unit variance is zero, against
the alternative that the mean is $\delta > 0$. From a sample $X_1, \ldots, X_N$ we compute
$N\bar{X}_N = \sum X_i$ and reject if $\sqrt{N}\,\bar{X}_N > K$, where $\Phi(K) = 1 - \alpha$. The power is
$\beta(N, \alpha, \delta) = 1 - \Phi(K - \sqrt{N}\,\delta)$. If we fix $\alpha$ and $\delta$ and let $N \to \infty$, $1 - \beta = P_{II}$
is equivalent (in the sense of ratio) to


$(1/\sqrt{N}\,\delta) \cdot (1/\sqrt{2\pi}) \exp[-(\tfrac12)(\sqrt{N}\,\delta - K)^2]$.


The limit of $\sqrt[N]{P_{II}}$ is thus $\exp[-(\tfrac12)\delta^2]$, as given by Chernoff. This is our base,
say $b_N(\alpha, \delta)$, which turns out to depend on $\delta$ but not on $\alpha$.
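The approach of the N-th root of the second-kind error to the base of the normal test can be seen numerically. This is a minimal sketch; the values K = 1.645 and $\delta$ = 0.1 are assumptions chosen so that the tail probability stays within floating-point range, and the function name is hypothetical.

```python
import math

def normal_type2_root(n, k, delta):
    """N-th root of P_II = Phi(K - sqrt(N) * delta) for the one-sided normal test."""
    z = math.sqrt(n) * delta - k                 # P_II = 1 - Phi(z)
    p2 = 0.5 * math.erfc(z / math.sqrt(2.0))     # upper tail, stable for large z
    return p2 ** (1.0 / n)

K = 1.645        # critical value for a level near 0.05 (illustrative)
delta = 0.1      # illustrative alternative
for n in (10**3, 10**4, 10**5):
    print(n, round(normal_type2_root(n, K, delta), 5))
print("base:", round(math.exp(-0.5 * delta ** 2), 5))
```

The printed roots decrease toward the base as N grows, with a gap of order $K\delta/\sqrt{N}$.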

Now suppose that the variance is unknown. We estimate it by $s^2/(N - 1)$,
where $s^2 = \sum (X_i - \bar{X}_N)^2 \sim \chi^2_{N-1}$, form $t_N = \sqrt{N}\,\bar{X}_N/(s/\sqrt{N - 1})$, and reject
if $t_N > K_N$, where $K_N \to K$. The power is

$\beta_N(\delta) = P\{\sqrt{N}\,\bar{X}_N > sK_N/\sqrt{N - 1}\}$,


$1 - \beta_N(\delta) = \int \Phi\left(\frac{sK_N}{\sqrt{N - 1}} - \sqrt{N}\,\delta\right) p_{\chi_{N-1}}(s)\,ds$.


We first consider an upper bound. Break the integration at $(N - 1)^{2/3}$ to get
$1 - \beta_N(\delta) < \Phi(-\delta\sqrt{N} + K_N(N - 1)^{1/6}) + P\{\chi_{N-1} > (N - 1)^{2/3}\}$. The first
term has base $\exp[-(\tfrac12)\delta^2]$ as before; the second has base 0, since (writing
$N - 1 = m$)


«© 
—2 2 3/4 1 
[ Cn uu" * exp (—4u’) du < exp (—3m*™*(4/2)” 
m2/3 


«o u m—2 u . 2 _u eo i 
[= (3) exp | -3(J5) ]a(J5) <2 exp (—4m""). 


Take the 1/(m + 1) power and pass to the limit to get 0.

A straightforward calculus argument permits us to verify the

LEMMA. If base $\{a_N\} \le$ base $\{b_N\}$, then base $\{a_N + b_N\}$ = base $\{b_N\}$.

With the aid of this lemma we see that our upper bound has the same base
$\exp[-(\tfrac12)\delta^2]$ as does $P_{II}$ for the normal test. But since the normal test is more
powerful than the t, it follows that the base of the t-test is also $b_N$.

We now apply these results to make limiting efficiency statements for fixed
$\delta$ with $N \to \infty$. Suppose, for a standard test, that $\sqrt[N]{1 - \beta(\delta)} \to A(\delta)$ as $N \to \infty$;
while for a second test, suppose that $\sqrt[N]{1 - \beta^*(\delta)} \to A^*(\delta)$. If we define N*(N)
as in Section 2, it is easy to see that $N/N^*(N) \to (\log A^*(\delta))/(\log A(\delta))$, since the
two error probabilities behave like $A^N$ and $A^{*N^*}$ and these agree when
$N \log A = N^* \log A^*$. Thus,
for the t-test compared to the normal, $e_{t,N}(\infty, \alpha, \delta) = 1$ for all $\alpha$, $\delta$. It follows that
the comparison of sign to t will be the same as that of sign to normal; and as the
latter is simpler, we shall examine it.

Let $X_1, \ldots, X_N$ be a sample from a normal population of unit variance.
We may test the hypothesis that $E(X_i) = 0$ against the alternative that
$E(X_i) = \delta > 0$ by using $\bar{X}_N$, in which case $\sqrt[N]{1 - \beta_N(\delta)} \to \exp[-(\tfrac12)\delta^2]$ as seen above. We
could also employ the sign test, rejecting the hypothesis if too many of the $X_i$ are
positive. The number of positive signs is binomial, with $p = \tfrac12$ under the hypothesis,
$p = \Phi(\delta)$ under the alternative. Therefore, for the sign test, we have from (3.2)
the base $2\sqrt{\Phi(\delta)[1 - \Phi(\delta)]}$. Thus for the sign test relative to the normal (and
hence to the t),


(3.3)    $e(\infty, \alpha, \delta) = -\frac{\log\{4\Phi(\delta)[1 - \Phi(\delta)]\}}{\delta^2}$.


This quantity is seen to be independent of $\alpha$ but dependent on $\delta$. As $\delta \to 0$,
$e(\infty, \alpha, \delta) \to 2/\pi$, which agrees with the Pitman efficiency. As $\delta \to \infty$,
$e(\infty, \alpha, \delta) \to \tfrac12$. A few values of (3.3) are shown in the table. It is notable that (3.3) is very
flat for $\delta$ in the range of interest, thus giving results in good agreement with
those obtained from the simpler Pitman limit.











TABLE

$\delta$        $1 - \Phi(\delta)$     $e_{S,t}(\infty, \alpha, \delta)$
0          .50        .637
.253       .40        .636
.524       .30        .634
1.645      .05        .614
3.090      .001       .578
3.719      .0001      .566
$\infty$   0          .500
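The tabulated values can be recomputed as the ratio of the logarithms of the two bases, namely $2\sqrt{\Phi(\delta)[1 - \Phi(\delta)]}$ for the sign test and $\exp[-(\tfrac12)\delta^2]$ for the normal test. The following is a minimal sketch; the function names are hypothetical and the standard library error functions stand in for normal tables.

```python
import math

def std_normal_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def sign_vs_t_limit(delta):
    """Limiting efficiency of the sign test relative to the normal (or t) test.

    Ratio of log bases: log(2 * sqrt(Phi * (1 - Phi))) / (-delta^2 / 2).
    """
    lower = std_normal_cdf(delta)
    upper = 0.5 * math.erfc(delta / math.sqrt(2.0))   # 1 - Phi, stable in the tail
    log_sign_base = 0.5 * math.log(4.0 * lower * upper)
    log_normal_base = -0.5 * delta ** 2
    return log_sign_base / log_normal_base

for delta in (0.253, 0.524, 1.645, 3.090, 3.719):
    print(delta, round(sign_vs_t_limit(delta), 3))
```

Evaluating the function at very small and very large $\delta$ gives values near $2/\pi$ and $\tfrac12$, the two limits stated in the text.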


The curve (3.3) may be regarded as the limit as $N \to \infty$ of the power efficiency
function, values of which for the sign test relative to the t-test have been given
by Dixon [13]. It appears from Dixon's charts that for fixed $\alpha$, the actual power
efficiency curve decreases smoothly toward its limit (3.3), making it possible
to interpolate for intermediate N and thus to obtain rough values of the power
of the t-test from binomial tables.

As a curiosity we finally examine the limit of $e(N, \alpha, \delta)$ as $\delta \to \infty$ with N,
$\alpha$ fixed, a limit which Dixon denotes by $E_\infty$. Our tool is an analog of the base:
we recall that for the normal test, $1 - \beta_N(\delta) = \exp[-(\tfrac12)N\delta^2] \cdot f(N, \alpha, \delta)$, where
$\log f(N, \alpha, \delta) = o(N\delta^2)$. Restricting ourselves to even N, we can use formula
(10) of [17] to show that for the t-test


$1 - \beta_N(\delta) = \exp[-(1 - x)N\delta^2/4] \cdot g(N, \alpha, \delta)$,

where again $\log g(N, \alpha, \delta) = o(N\delta^2)$ and x is the $\alpha$-point on the beta distribution.
LEMMA 3.3. Suppose that two tests A and B are available for each sample size
N, and that

$1 - \beta_A(N, \alpha, \delta) = \exp(-a_N\delta^2)\,f(N, \alpha, \delta)$,
$1 - \beta_B(N, \alpha, \delta) = \exp(-bN\delta^2)\,g(N, \alpha, \delta)$,

where $\log f(N, \alpha, \delta)$ and $\log g(N, \alpha, \delta)$ are $o(\delta^2)$.

Suppose that $\beta_B(N, \alpha, \delta)$ is strictly monotonely increasing in N, $\beta_B(1, \alpha, \delta) = \alpha$,
$\beta_B(N, \alpha, \delta) \to 1$ as $N \to \infty$. Then $e_{A,B}(N, \alpha, \infty) = (N_0 + 1)/N$, where $N_0$
is the greatest integer less than $a_N/b$.

PROOF. We shall first assume that $a_N/b$ is not an integer, so that there exists
an integer $N_0$ with $N_0 < a_N/b < N_0 + 1$. Examining the ratio
$[1 - \beta_A(N)]/[1 - \beta_B(m)] = [f(N)/g(m)] \exp[(bm - a_N)\delta^2]$, we see that for all sufficiently large $\delta$,

$\beta_B(N_0, \alpha, \delta) < \beta_A(N, \alpha, \delta) < \beta_B(N_0 + 1, \alpha, \delta)$.

Recalling our definition of efficiency, we see that if $p(\delta)$ is defined by

(3.4)    $p(\delta)\beta_B(N_0, \alpha, \delta) + [1 - p(\delta)]\beta_B(N_0 + 1, \alpha, \delta) = \beta_A(N, \alpha, \delta)$,

then


$e_{A,B}(N, \alpha, \delta) = \frac{N_0 + 1 - p(\delta)}{N}$.


If we solve (3.4) for $p(\delta)$ and let $\delta \to \infty$, we find that $p(\delta) \to 0$. Therefore
$e_{A,B}(N, \alpha, \infty) = (N_0 + 1)/N$.

In the remaining case, in which $a_N/b$ is an integer, let $N_0 + 1 = a_N/b$. A
similar analysis then produces the same limiting formula.
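The conclusion of Lemma 3.3 is a simple rounding rule, and a short sketch makes the integer and non-integer cases explicit. The function name and the example exponents below are hypothetical; exact rational arithmetic is used so that the case in which $a_N/b$ is an integer is not disturbed by rounding.

```python
import math
from fractions import Fraction

def limiting_efficiency(a_n, b, n):
    """(N0 + 1) / N, with N0 the greatest integer strictly less than a_n / b."""
    ratio = Fraction(a_n) / Fraction(b)
    n0 = math.ceil(ratio) - 1      # strictly below ratio, also when ratio is an integer
    return Fraction(n0 + 1, n)

# Non-integer case: a_N / b = 4.6, so N0 = 4 and the efficiency is 5/5.
print(limiting_efficiency(Fraction(23, 10), Fraction(1, 2), 5))
# Integer case: a_N / b = 4, so N0 = 3 and the efficiency is 4/5.
print(limiting_efficiency(Fraction(2), Fraction(1, 2), 5))
```

The jump between the two cases as $a_N/b$ crosses an integer is the kind of discontinuity that the limiting efficiency can exhibit.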
It is convenient to introduce the convention that [u] denotes the greatest
integer less than u. Then we see that


$e_{t,N}(N, \alpha, \infty) = \frac{[(1 - x)N/2] + 1}{N}$.

It is notable that this limiting efficiency is a discontinuous function of $\alpha$. Given
any N, there exists an $\alpha_0(N)$ such that for $\alpha < \alpha_0(N)$, $e_{t,N}(N, \alpha, \infty) = 1$. But if
we fix $\alpha$ and let $N \to \infty$, $e_{t,N}(N, \alpha, \infty)$ tends to a limit less than 1. Thus,

$\lim_{N \to \infty} \lim_{\delta \to \infty} e_{t,N}(N, \alpha, \delta) \ne \lim_{\delta \to \infty} \lim_{N \to \infty} e_{t,N}(N, \alpha, \delta)$.


Now consider the behavior of $\beta_S(N, \alpha, \delta)$ as $\delta \to \infty$. For an individual
observation, the probability p of a positive sign is $\Phi(\delta) \sim 1 - (1/\delta)\varphi(\delta)$ for large $\delta$.
Examination of Lemmas 3.1 and 3.2 shows that the assumptions of Lemma 3.3
are met by the sign test with $a_N$ the critical value for the number of positive
signs. Therefore


é...(N,a, 0) = =— 


This formula is not comparable to the $E_\infty$ of Dixon, since our definition of N*
is not the same as his.


ON REGULAR BEST ASYMPTOTICALLY NORMAL ESTIMATES


By CHIN LONG CHIANG
University of California, Berkeley


1. Introduction and summary. This study was initiated in connection with 
estimating parameters involved in a certain stochastic process of population 
growth. Because of the nature of distribution functions arising in such studies, 
the usual methods of estimation result in formulas which are so complex that it 
is difficult, if not impossible, to obtain explicit solutions for the estimates of the 
parameters. Investigation of the problem led to an extension of the method of 
best asymptotically normal estimates developed by Neyman [1]. The estimates 
derived are termed regular best asymptotically normal estimates (RBAN esti- 
mates). This extension can be applied to other problems. 

In [1], Neyman considers a whole class of estimates which possess the proper- 
ties of consistency, of asymptotic normality, and of asymptotic efficiency, and 
he provides estimates having these asymptotic properties for the case of multi- 
nomial distributions. His method is extended in the present paper to a more 
general case in which random vectors are dealt with. Such an extension was 
considered by Barankin and Gurland [2], who studied a large class of estimates 
and showed that if the distributions involved are members of Koopman’s family, 
it is still possible to reach the Cramér-Rao lower bound. 

The purposes of the present paper are to discuss a subclass of the estimates 
considered by Barankin and Gurland and to present simple methods of generating 
such estimates. The estimates discussed are based on a number of independent 
random vectors whose distribution functions are not specified. It is proved that 
under certain regularity conditions, the regular and consistent estimates obtained 
are asymptotically normal as the number of random vectors tends to infinity. A 
necessary and sufficient condition for a regular and consistent estimate to have 
a “minimal” asymptotic covariance matrix is given. An expression is derived 
for the "minimal" asymptotic covariance matrix. It is also proved that if a
function f satisfies certain conditions, then in order that $f(\hat\theta)$ be an RBAN estimate
of $f(\theta)$ at $f(\theta^0)$, where $\theta^0$ is the true value of the parameter point $\theta$, it is necessary
and sufficient that the argument $\hat\theta$ be an RBAN estimate of $\theta$ at $\theta^0$. Methods of
generating RBAN estimates are given.

For simplicity of presentation, matrix notation is used throughout this paper. 
By derivatives of a matrix with respect to a vector (or with respect to a second 
matrix) is meant the derivatives of the matrix simultaneously with respect to 
all the components of the vector (or all the elements of the second matrix). 
The usual rules of differentiation with respect to vectors are used. 


Received March 14, 1955.

2. Assumptions and definitions. Let

$Z_\alpha = (Z_{\alpha 1}, Z_{\alpha 2}, \ldots, Z_{\alpha n})'$,    for $\alpha = 1, 2, \ldots, m$,


be a sequence of independent random vectors taking their values in an
n-dimensional Euclidean space $R_n$. Let


$\theta = (\theta_1, \ldots, \theta_s)'$


be a parameter point ranging through a subset $\Theta$ of an s-dimensional Euclidean
space. The true value of the parameter, denoted by $\theta^0$, is assumed to be the
center of a nondegenerate s-dimensional sphere contained in $\Theta$. For each $\theta \in \Theta$
and for each $\alpha$, it is assumed that the vector $Z_\alpha$ either possesses a probability
density or is discrete. The density or frequency function of $Z_\alpha$ will be denoted
by $p_\alpha(z; \theta)$, depending on $\theta$. Let $\zeta_\alpha(\theta)$ be the expectation of $Z_\alpha$ and let $\bar\zeta_m(\theta)$ be
the average of the expectations, i.e.,


$\bar\zeta_m(\theta) = \frac{1}{m} \sum_{\alpha=1}^{m} \zeta_\alpha(\theta)$.

For the sake of simplicity in further formulas, whenever the parameter takes on
the true value $\theta^0$, $\zeta_\alpha$ will be written for $\zeta_\alpha(\theta^0)$.
The following assumptions will be made throughout the paper. 
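ASSUMPTION 1. The second central moments


(1)    $\sigma_{Z_{\alpha i} Z_{\alpha j}} = E\{(Z_{\alpha i} - \zeta_{\alpha i})(Z_{\alpha j} - \zeta_{\alpha j}) \mid \theta = \theta^0\}$,    for $\alpha = 1, 2, \ldots, m$,


are finite and the matrix

(2)    $\sigma_m = \left\| \frac{1}{m} \sum_{\alpha=1}^{m} \sigma_{Z_{\alpha i} Z_{\alpha j}} \right\|$,    $i, j = 1, 2, \ldots, n$,

tends to a positive definite matrix $\sigma^*$ as m tends to infinity.

ASSUMPTION 2. Let


$\rho^2 = \sum_{i=1}^{n} (Z_{\alpha i} - \zeta_{\alpha i})^2$.


Then, for every $\varepsilon > 0$,


$\lim_{m \to \infty} \frac{1}{m} \sum_{\alpha=1}^{m} \int_{|\rho| > \varepsilon\sqrt{m}} \rho^2\, p_\alpha(z; \theta^0)\,dz = 0$.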

ASSUMPTION 3. As $m \to \infty$, $\bar\zeta_m(\theta)$ tends to $\zeta(\theta)$ in such a way that
$\sqrt{m}\,|\bar\zeta_m(\theta) - \zeta(\theta)|$ tends to zero. Let $\theta^{(k)} = (\theta_1, \theta_2, \ldots, \theta_k)'$, for $1 \le k \le s$.
The function $\zeta(\theta)$ has continuous second partial derivatives with respect to $\theta$,
and the matrix


(3)    $v_k(\theta) = \frac{\partial \zeta(\theta)}{\partial \theta^{(k)}}$


has rank k in the neighborhood of the true value $\theta^0 = (\theta_1^0, \theta_2^0, \ldots, \theta_s^0)'$ for
every k, $1 \le k \le s$.

Constant use will be made of the following definitions.

DEFINITION 1. Let $\{\hat\theta_m(Z_1, Z_2, \ldots, Z_m)\}$, $m = 1, 2, \ldots$, be a sequence of
functions of the observations, taking their values in the s-dimensional space
containing $\Theta$. The sequence $\{\hat\theta_m\}$ will be said to be a consistent estimate of the
parameter point $\theta$ at the true value $\theta^0$ if, as $m \to \infty$, $\hat\theta_m$ tends in probability
to $\theta^0$. This means that for every $\varepsilon > 0$ and $\eta > 0$, there exists a number $m_{\varepsilon,\eta}$
such that $m > m_{\varepsilon,\eta}$ implies


$\Pr\{|\hat\theta_m - \theta^0| > \varepsilon \mid \theta = \theta^0\} < \eta$.


DEFINITION 2. Let B be a positive definite matrix such that as $m \to \infty$, the
distribution of $\sqrt{m}\,B^{-1}(\hat\theta_m - \theta^0)$ tends to a multivariate normal distribution
with mean zero and identity covariance matrix; then for $\theta^0$, the estimate
$\hat\theta_m$ is said to be consistent and asymptotically normal and $m^{-1}BB'$ is said to be
the asymptotic covariance matrix of $\hat\theta_m$.

Let $\bar{Z}_m = \frac{1}{m} \sum_{\alpha=1}^{m} Z_\alpha$ be the average of the vectors $Z_\alpha$ and let $S_m$ be the
sample covariance matrix.

DEFINITION 3. An estimate $\hat\theta_m$ is said to be regular if

(i) for every $m = 1, 2, \ldots$, the function $\hat\theta_m(Z_1, Z_2, \ldots, Z_m)$ is either a
function of $\bar Z_m$ or a function of $\bar Z_m$ and $S_m$, but it does not depend explicitly
either on $m$ or on the individual vectors $Z_\alpha$; i.e., $\hat\theta_m(Z_1, Z_2, \ldots, Z_m) = \hat\theta(\bar Z_m)$
or $\hat\theta_m(Z_1, Z_2, \ldots, Z_m) = \hat\theta(\bar Z_m, S_m)$; and

(ii) $\hat\theta(\bar Z_m)$ has continuous first partial derivatives with respect to $\bar Z_m$ when the
estimate is a function of $\bar Z_m$; or, $\hat\theta(\bar Z_m, S_m)$ has continuous first partial
derivatives with respect to $\bar Z_m$ and $S_m$ when the estimate is a function of $\bar Z_m$ and $S_m$.

Since the theorems in the following section will be concerned mainly with the
derivatives of $\hat\theta_m$ with respect to $\bar Z_m$, only $\hat\theta(\bar Z_m)$ will be used in Section 3.


3. The main theorems. The purposes of this section are to show that regular 
and consistent estimates are asymptotically normal, to derive a necessary and 
sufficient condition for an asymptotically normal estimate to have a ‘‘minimal” 
asymptotic covariance matrix, and to give an expression for the “minimal” 
covariance matrix. 

The following well-known lemmas are stated in appropriate forms. 

LEMMA 1. Let $X_m$ and $Y_m$ be s-dimensional random vectors and let $Z_m$ be an
n-dimensional random vector satisfying the relationship

$$X_m = D_m Z_m + Y_m,$$

where $D_m$ is an $s \times n$ random matrix. Suppose that as $m \to \infty$, $D_m$ tends in
probability to a matrix $D$ with constant elements, $Z_m$ has a limiting distribution, and
$Y_m$ tends in probability to zero. Then $X_m$ has the limiting distribution defined by the
relation $X = DZ$, where $Z$ has the limiting distribution of $Z_m$.

PROOF. $X_m$ may be written as

$$X_m = DZ_m + (D_m - D)Z_m + Y_m = DZ_m + U_m.$$

According to a theorem by Slutsky [3] (see also Cramér [4], pp. 254, 299; Neyman
[1], p. 245), $U_m$ tends in probability to zero as $m \to \infty$. Furthermore, $DZ_m$ has
the same limiting distribution as $DZ$. The lemma follows from a second
application of Slutsky's theorem.

LEMMA 2. Let Assumption 2 be satisfied. As $m \to \infty$, $\bar Z_m$ tends in probability
to $\zeta^0$, and the distribution of $\sqrt{m}(\bar Z_m - \zeta^0)$ tends to an n-variate normal distribution
with mean zero and covariance matrix $\sigma^*$, as defined in Assumption 1.

The lemma is a consequence of the central limit theorem (cf. [5], p. 113) and
Lemma 1, since

$$\sqrt{m}(\bar Z_m - \zeta^0) = \sqrt{m}(\bar Z_m - \bar\zeta^0_m) + \sqrt{m}(\bar\zeta^0_m - \zeta^0),$$

where $\bar\zeta^0_m = \bar\zeta_m(\theta^0)$.


Let

$$A = \|a_{ki}\|, \quad k = 1, 2, \ldots, s; \quad i = 1, 2, \ldots, n,$$

be an $s \times n$ matrix of rank $s \le n$. Let $A_k$ stand for the kth row vector of the matrix
$A$, for $k = 1, 2, \ldots, s$.

THEOREM 1. Under Assumptions 1, 2, and 3, the random vector

(4) $\sqrt{m}\, X_m = \sqrt{m}\, A(\bar Z_m - \zeta^0)$

has an s-variate asymptotically normal distribution with mean zero and covariance
matrix $A\sigma^*A'$. Moreover, as $m \to \infty$, $X_m$ tends in probability to zero.

PROOF. According to Definition 2 of the asymptotic covariance matrix, it is
to be shown here that there exists a positive definite matrix, $B$, say, such that
the quantity $\sqrt{m}\, B^{-1}A(\bar Z_m - \zeta^0)$ has a limiting distribution which is normal
with mean zero and identity covariance matrix. Since $\sigma^*$ is a positive
definite symmetric matrix, and since $A$ has rank $s$, $A\sigma^*A'$ is also a positive definite
symmetric matrix. Hence, there exists a unique positive definite symmetric
matrix, $B$, such that $B^2 = A\sigma^*A'$, or equivalently,

$$B^{-1}(A\sigma^*A')B^{-1} = I_s \quad (s \times s \text{ identity matrix}).$$

Because of Lemmas 1 and 2, $\sqrt{m}\, B^{-1}A(\bar Z_m - \zeta^0)$ has the same limiting
distribution as $B^{-1}AY$, where $Y$ is normally distributed with mean zero and
covariance matrix $\sigma^*$. Consequently, $\sqrt{m}\, B^{-1}A(\bar Z_m - \zeta^0)$ is asymptotically normal
with mean zero and covariance matrix $B^{-1}A\sigma^*A'B^{-1}$, which is an
identity matrix. Thus, by Definition 2, the asymptotic covariance matrix of
$\sqrt{m}\, A(\bar Z_m - \zeta^0)$ is $BB' = A\sigma^*A'$.

The convergence of $X_m$ to zero follows immediately from the equation
$X_m = A(\bar Z_m - \zeta^0)$ and from the fact that $\bar Z_m$ tends in probability to $\zeta^0$ (Lemma 2).

THEOREM 2. Suppose that $\hat\theta(\bar Z_m)$ is a regular and consistent estimate of $\theta$ at
$\theta^0$. Then,

(i) $\hat\theta(\zeta^0) = \theta^0$ with a probability tending to one, and

(ii) $\sqrt{m}\,[\hat\theta(\bar Z_m) - \theta^0]$ has an s-variate asymptotically normal distribution with
mean zero and covariance matrix $A_0\sigma^*A_0'$, with

(5) $A_0 = \left.\dfrac{\partial \hat\theta(\bar Z_m)}{\partial \bar Z_m}\right|_{\bar Z_m = \zeta^0}$.

PROOF. Since $\bar Z_m$ tends in probability to $\zeta^0$,

$$\lim_{m \to \infty} \hat\theta(\bar Z_m) = \hat\theta(\zeta^0),$$

and since $\hat\theta$ is consistent, $\hat\theta(\bar Z_m)$ tends also to $\theta^0$; part (i) follows.
According to Taylor's theorem,

$$\hat\theta(\bar Z_m) - \hat\theta(\zeta^0) = A_0^*(\bar Z_m - \zeta^0)$$

with

$$A_0^* = \left.\frac{\partial \hat\theta(\bar Z_m)}{\partial \bar Z_m}\right|_{\zeta^0 + \delta_m(\bar Z_m - \zeta^0)},$$

where $\delta_m$ is an $n \times n$ diagonal matrix having all its diagonal elements between
zero and unity. As $m$ tends to infinity, $\zeta^0 + \delta_m(\bar Z_m - \zeta^0)$ tends in probability
to $\zeta^0$. Since the derivatives of $\hat\theta$ are assumed to be continuous, $A_0^*$ tends in
probability to $A_0$. Consequently, $\sqrt{m}\,[\hat\theta(\bar Z_m) - \theta^0]$ has the same limiting distribution
as $\sqrt{m}\, A_0(\bar Z_m - \zeta^0)$ (Lemma 1). The rest of the proof follows from Theorem 1.
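
Theorem 2 can be illustrated numerically. The following is a minimal sketch, assuming a hypothetical scalar estimate (the ratio of the two components of the mean, so n = 2 and s = 1) and an illustrative limiting covariance matrix; neither the function names nor the numbers come from the paper. It approximates the derivative A_0 by central differences and evaluates the limiting variance A_0 sigma* A_0'.

```python
# Sketch of Theorem 2 for a hypothetical scalar estimate theta_hat(z) = z1 / z2.
# All numbers and function names here are illustrative assumptions.

def theta_hat(z):
    """Hypothetical regular estimate: ratio of the two components of the mean."""
    return z[0] / z[1]

def gradient(f, z, h=1e-6):
    """Central-difference approximation to A_0 = d theta_hat / d z at the point z."""
    grad = []
    for i in range(len(z)):
        up = list(z)
        up[i] += h
        dn = list(z)
        dn[i] -= h
        grad.append((f(up) - f(dn)) / (2 * h))
    return grad

def quad_form(a, sigma):
    """Compute a * sigma * a' for a row vector a and a square matrix sigma."""
    n = len(a)
    return sum(a[i] * sigma[i][j] * a[j] for i in range(n) for j in range(n))

zeta0 = [2.0, 4.0]                      # assumed true expectation
sigma_star = [[1.0, 0.5], [0.5, 2.0]]   # assumed limiting covariance matrix

A0 = gradient(theta_hat, zeta0)         # analytic value is (1/z2, -z1/z2**2)
limit_var = quad_form(A0, sigma_star)   # variance of sqrt(m) * (theta_hat - theta0)
print(A0, limit_var)
```

For this choice the analytic derivative is (0.25, -0.125), so the numerical gradient can be checked directly against it.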

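COROLLARY. Let $\theta_k$ be the kth element of the vector $\theta$. Suppose that $\hat\theta_k(\bar Z_m)$ is a
regular and consistent estimate of $\theta_k$ at $\theta_k^0$; then,

(i) $\hat\theta_k(\zeta^0) = \theta_k^0$ with a probability tending to one, and

(ii) $\sqrt{m}\,[\hat\theta_k(\bar Z_m) - \theta_k^0]$ has an asymptotically normal distribution with mean zero
and variance $A_k\sigma^*A_k'$, with

$$A_k = \left.\frac{\partial \hat\theta_k(\bar Z_m)}{\partial \bar Z_m}\right|_{\zeta^0}.$$

The corollary, which is a direct consequence of Theorem 2, may also be verified
by considering the following equation:

$$\hat\theta_k(\bar Z_m) - \hat\theta_k(\zeta^0) = A_k^*(\bar Z_m - \zeta^0),$$

with

$$A_k^* = \left.\frac{\partial \hat\theta_k(\bar Z_m)}{\partial \bar Z_m}\right|_{\zeta^0 + \delta_m(\bar Z_m - \zeta^0)}.$$

Here, the diagonal matrix $\delta_m$ has the same meaning as defined in Theorem 2.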
DEFINITION 4. Let $\mathcal{C}^*$ be a class of symmetric positive definite matrices of
rank $s$. A matrix $G \in \mathcal{C}^*$ is said to be minimal with respect to $\mathcal{C}^*$ if, for every
$H \in \mathcal{C}^*$, the difference $H - G$ is positive semidefinite; i.e., for any $1 \times s$ row
vector $u$ and for any $H \in \mathcal{C}^*$, the quadratic form $u(H - G)u'$ is nonnegative.
Let $\mathcal{C}$ be the class of matrices which are covariance matrices of the limiting
distribution of $\sqrt{m}\,[\hat\theta(\bar Z_m) - \theta^0]$ for some regular and consistent estimate
$\hat\theta(\bar Z_m)$.

DEFINITION 5. A regular and consistent estimate $\hat\theta(\bar Z_m)$ is said to be regular
best asymptotically normal (RBAN) if the covariance matrix $A_0\sigma^*A_0'$ of the
limiting distribution of $\sqrt{m}\,[\hat\theta(\bar Z_m) - \theta^0]$ is minimal with respect to the class $\mathcal{C}$.

THEOREM 3. Let $\hat\theta(\bar Z_m)$ be a regular and consistent estimate of $\theta^0$. Then,

(i) in order for $\hat\theta(\bar Z_m)$ to be RBAN, it is sufficient that the matrix

$$A_0 = \left.\frac{\partial \hat\theta(\bar Z_m)}{\partial \bar Z_m}\right|_{\bar Z_m = \zeta^0}$$

satisfy the condition

(6) $A_0 = C_0^{-1} V_0' \sigma^{*-1}$,

where $C_0 = V_0' \sigma^{*-1} V_0$ and

$$V_0 = \left.\frac{\partial \zeta(\theta)}{\partial \theta}\right|_{\theta^0};$$

(ii) the corresponding “minimal” asymptotic covariance matrix of $\hat\theta(\bar Z_m)$ is
given by

(7) $\sigma^2_{\hat\theta} = m^{-1} C_0^{-1}$;

(iii) condition (6) is also necessary if there exists a regular and consistent
estimate having the asymptotic covariance matrix $m^{-1} C_0^{-1}$.
PROOF. If (6) is true, then

$$\begin{aligned}
A_0 \sigma^* A_0' &= (C_0^{-1} V_0' \sigma^{*-1})\, \sigma^*\, (C_0^{-1} V_0' \sigma^{*-1})' \\
&= C_0^{-1} V_0' \sigma^{*-1} \sigma^* \sigma^{*-1} V_0 C_0^{-1} \\
&= C_0^{-1} V_0' \sigma^{*-1} V_0 C_0^{-1} = C_0^{-1}.
\end{aligned}$$

Thus, the second part of the theorem is an immediate consequence of the first
part.

The proof of the first part rests upon the fact that for any regular and
consistent estimate $\hat\theta$, the equation $\hat\theta(\zeta^0) = \theta^0$ holds with a probability tending to
one, and this implies that

$$\left.\frac{\partial \hat\theta[\zeta(\theta)]}{\partial \zeta}\right|_{\zeta^0} \left.\frac{\partial \zeta(\theta)}{\partial \theta}\right|_{\theta^0} = I_s,$$

which can be rewritten as

(8) $A_0 V_0 = I_s$.

The asymptotic covariance matrix of $\hat\theta$, by Theorem 2, has the form

(9) $\sigma^2_{\hat\theta} = m^{-1} A_0 \sigma^* A_0'$.

To minimize $A_0 \sigma^* A_0'$ subject to condition (8), introduce the Lagrange multiplier

$$a = \|a_{kl}\|, \quad k, l = 1, \ldots, s,$$

differentiate

$$A_0 \sigma^* A_0' - 2a(V_0' A_0' - I_s)$$

with respect to $A_0$, set the derivative equal to zero to get the equation

(10) $A_0 \sigma^* - a V_0' = 0$,

and solve (10) for $A_0$,

(11) $A_0 = a V_0' \sigma^{*-1}$.

Substituting (11) in (8) gives

$$a V_0' \sigma^{*-1} V_0 = I_s,$$

i.e., $a C_0 = I_s$, and hence

(12) $a = C_0^{-1}$.

It follows from equations (11) and (12) that

(6) $A_0 = C_0^{-1} V_0' \sigma^{*-1}$.


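It is easy to verify that a regular and consistent estimate satisfying equation
(6) will have the property of bestness as defined in Definition 5. Suppose that
$\tilde\theta(\bar Z_m)$ is any regular and consistent estimate of $\theta$ at $\theta^0$, and let $\tilde A_0$ be the
corresponding derivative taken at $\zeta^0$; then $\tilde A_0$ also satisfies equation (8). If $\tilde A_0$ does
not satisfy condition (6) but $A_0$ does, then the difference $(\tilde A_0 - A_0)\sigma^*(\tilde A_0 - A_0)'$
is positive semidefinite. Since

$$\begin{aligned}
(\tilde A_0 - A_0)\sigma^*(\tilde A_0 - A_0)' &= (\tilde A_0 - C_0^{-1} V_0' \sigma^{*-1})\,\sigma^*\,(\tilde A_0 - C_0^{-1} V_0' \sigma^{*-1})' \\
&= \tilde A_0 \sigma^* \tilde A_0' - C_0^{-1} V_0' \sigma^{*-1} \sigma^* \tilde A_0' - \tilde A_0 \sigma^* \sigma^{*-1} V_0 C_0^{-1} \\
&\quad + C_0^{-1} V_0' \sigma^{*-1} \sigma^* \sigma^{*-1} V_0 C_0^{-1} \\
&= \tilde A_0 \sigma^* \tilde A_0' - C_0^{-1} I_s - I_s C_0^{-1} + C_0^{-1} C_0 C_0^{-1} \\
&= \tilde A_0 \sigma^* \tilde A_0' - A_0 \sigma^* A_0',
\end{aligned}$$

the difference $\tilde A_0 \sigma^* \tilde A_0' - A_0 \sigma^* A_0'$ is also positive semidefinite. The result follows
from Definitions 4 and 5.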

To prove part (iii), suppose that there exists a regular and consistent estimate
$\hat\theta$ whose derivative taken at $\zeta^0$, $A_0$, satisfies condition (6). Let $\tilde\theta$ be any other
regular and consistent estimate of $\theta$ at $\theta^0$ and let $\tilde A_0$ be its derivative taken at
$\zeta^0$. In order that the asymptotic covariance matrix $\sigma^2_{\tilde\theta} = m^{-1} \tilde A_0 \sigma^* \tilde A_0'$ of $\tilde\theta$ be
minimal, it is obviously necessary that

$$\tilde A_0 \sigma^* \tilde A_0' = A_0 \sigma^* A_0',$$

which implies the equation

(13) $(\tilde A_0 - A_0)\sigma^*(\tilde A_0 - A_0)' = 0$.

Equation (13) holds only if $\tilde A_0 - A_0 = 0$, i.e.,

$$\tilde A_0 = A_0 = C_0^{-1} V_0' \sigma^{*-1},$$

proving part (iii).
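
The algebra of Theorem 3 can be checked on a small case. The sketch below assumes s = 1 and n = 2 with an illustrative covariance matrix and derivative vector (these numbers, and all function names, are hypothetical). It builds A_0 from condition (6), confirms the side condition (8) and the value 1/C_0 of the limiting variance, and compares A_0 with competitors of the form A_0 + t w, where w V_0 = 0 so that (8) still holds.

```python
# Numerical check of condition (6) for s = 1, n = 2 (illustrative numbers only).

def inv2(m):
    """Inverse of a 2 x 2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def row_times_matrix(r, m):
    """Row vector (length 2) times a 2 x 2 matrix."""
    return [sum(r[i] * m[i][j] for i in range(2)) for j in range(2)]

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

sigma_star = [[2.0, 0.6], [0.6, 1.0]]   # assumed limiting covariance matrix
V0 = [1.0, 3.0]                         # assumed d zeta / d theta at theta0 (n x 1)

V0t_sigma_inv = row_times_matrix(V0, inv2(sigma_star))  # V0' sigma*^{-1}
C0 = dot(V0t_sigma_inv, V0)                             # C0 = V0' sigma*^{-1} V0
A0 = [x / C0 for x in V0t_sigma_inv]                    # condition (6)

def limit_variance(A):
    """A sigma* A' for a 1 x 2 row vector A."""
    return dot(row_times_matrix(A, sigma_star), A)

# Any other row vector with A V0 = 1 can be written as A0 + t * w with w V0 = 0.
w = [3.0, -1.0]
competitors = [[A0[i] + t * w[i] for i in range(2)] for t in (-0.5, 0.1, 1.0)]
for A in competitors:
    print(dot(A, V0), limit_variance(A) - limit_variance(A0))
```

Each printed difference is the excess variance of a competitor over the value attained under (6); in this example it equals t squared times w sigma* w'.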

It will be shown in the next section that there exist estimates of $\theta$, regular
and consistent, whose asymptotic covariance matrix is equal to $m^{-1} C_0^{-1}$.

REMARK. When a regular and consistent estimate $\hat\theta$ is a function of both
$\bar Z_m$ and $S_m$, i.e., when $\hat\theta = \hat\theta(\bar Z_m, S_m)$, a sufficient condition on the
derivatives of the estimate with respect to $\bar Z_m$ and $S_m$, taken at $\zeta^0$ and $\sigma^*$, for
the estimate to have a “minimal” asymptotic covariance matrix can be
deduced by a similar approach. The condition so obtained will be similar to, though
not the same as, (6). Both the condition and the corresponding “minimal”
asymptotic covariance matrix of the estimate will involve the covariance matrix
between $\bar Z_m$ and $S_m$. If the covariance matrix between $\bar Z_m$ and $S_m$ is unknown,
it is tedious to obtain estimates having such a “minimal” asymptotic covariance
matrix. On the other hand, a condition imposed on the derivative of $\hat\theta(\bar Z_m, S_m)$
with respect to $\bar Z_m$ taken at $\zeta^0$ and $\sigma^*$ will be the same as (6), and the
corresponding asymptotic covariance matrix of the estimate will be equal to $m^{-1} C_0^{-1}$, if the
variation of $S_m$ is neglected. Neglecting this variation is, in a way, not desirable. The
essential purpose of this study, however, is not only to deduce a necessary and
sufficient condition for an estimate to have a “minimal” asymptotic covariance
matrix, but also to generate estimates satisfying such a condition or having
such a “minimal” asymptotic covariance matrix. Therefore, it seems
justified to content oneself with condition (6) and with estimates having the
asymptotic covariance matrix given by equation (7) (see Section 4).

COROLLARY 1. Let $\theta_k$ be the kth element of the vector $\theta$ and $\hat\theta_k(\bar Z_m)$ be a regular
and consistent estimate of $\theta_k$ at $\theta_k^0$, for $k = 1, 2, \ldots, s$. Then,

(i) in order for $\hat\theta_k(\bar Z_m)$ to be RBAN, for every $k = 1, 2, \ldots, s$, it is sufficient
that the matrix $A_0$ satisfy the condition

(6) $A_0 = C_0^{-1} V_0' \sigma^{*-1}$;

(ii) the minimum asymptotic variance of $\hat\theta_k(\bar Z_m)$ is given by

(14) $\sigma^2_{\hat\theta_k} = m^{-1} \varepsilon_k C_0^{-1} \varepsilon_k'$,

where $\varepsilon_k = (0, \ldots, 0, 1, 0, \ldots, 0)$ is a $1 \times s$ row vector with all elements zero
except the kth element, which is unity.

PROOF. In order to prove (i), it is adequate to show that for every
$k = 1, 2, \ldots, s$, a sufficient condition for $\hat\theta_k(\bar Z_m)$ to have minimum asymptotic
variance is implied in (6). Since $\hat\theta_k$ is regular and consistent, the corollary to
Theorem 2 gives

(15) $\hat\theta_k(\zeta^0) = \theta_k^0$

and the asymptotic variance of $\hat\theta_k$,

$$\sigma^2_{\hat\theta_k} = m^{-1} A_k \sigma^* A_k'.$$

By an approach similar to that employed in the proof of Theorem 3, minimization
of $A_k \sigma^* A_k'$ leads to the equation

(16) $A_k = \varepsilon_k C_0^{-1} V_0' \sigma^{*-1}$.

Thus, equation (16) is a sufficient condition for $\hat\theta_k$ to have the minimum
asymptotic variance. Because $\varepsilon_k C_0^{-1} V_0' \sigma^{*-1}$ is the kth row of the matrix $C_0^{-1} V_0' \sigma^{*-1}$, (16)
is the same condition that (6) imposes on the kth row of the matrix $A_0$. This
means that equation (6) implies equation (16), for every $k = 1, 2, \ldots, s$, thus
proving part (i). It is of significance to note that whereas condition (6) implies
the entire set of $s$ equations (16) for $k = 1, 2, \ldots, s$, the entire set of the $s$
equations (16) also implies equation (6).

Part (ii) of the corollary can be shown by substituting (16) in the expression
$A_k \sigma^* A_k'$. Simple computation gives $\varepsilon_k C_0^{-1} \varepsilon_k'$. Equation (14) follows. The
right-side member of (14) is identically equal to the kth diagonal element of $m^{-1} C_0^{-1}$,
the asymptotic covariance matrix of $\hat\theta$.

The significance of this corollary is that when all of the components of the
vector $\theta$ are estimated simultaneously, each of the individual estimates will
have the minimum asymptotic variance.

COROLLARY 2. Let $\hat\theta(\bar Z_m)$ be a regular and consistent estimate of $\theta$ at $\theta^0$. Suppose
that the components of the random vector $\bar Z_m$ are statistically independent. Then

(i) in order for $\hat\theta(\bar Z_m)$ to be RBAN, it is sufficient that the matrix $A_0$ satisfy the
condition

$$A_0 = G_0^{-1} V_0' D^{*-1},$$

where $D^*$ is the limit of the diagonal matrix $D_m$, with diagonal elements

$$\frac{1}{m} \sum_{\alpha=1}^{m} \sigma^2_{Z_{\alpha i}}, \quad \text{for } i = 1, 2, \ldots, n,$$

and $G_0 = V_0' D^{*-1} V_0$;

(ii) the “minimal” asymptotic covariance matrix of $\hat\theta(\bar Z_m)$ is given by

(17) $\sigma^2_{\hat\theta} = m^{-1} G_0^{-1}$.

The corollary is a direct consequence of the substitution of $\sigma_{Z_{\alpha i} Z_{\alpha j}} = 0$ for
$i \ne j$; $i, j = 1, 2, \ldots, n$, in Theorem 3.
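
A familiar special case may make Corollary 2 concrete. The sketch below assumes a scalar parameter (s = 1) that is the common expectation of n = 3 independent components, so that V_0 is a column of ones; the variances in `d_star` are illustrative and the function names are hypothetical. Under these assumptions the condition of Corollary 2 reduces to inverse-variance weighting.

```python
# Corollary 2 with s = 1: A_0 = G_0^{-1} V_0' D*^{-1} (illustrative numbers only).

def rban_weights_independent(v0, d_star):
    """Return the row vector A_0 and the limiting variance 1 / G_0."""
    g0 = sum(v * v / d for v, d in zip(v0, d_star))   # G_0 = V_0' D*^{-1} V_0
    a0 = [v / d / g0 for v, d in zip(v0, d_star)]     # G_0^{-1} V_0' D*^{-1}
    return a0, 1.0 / g0

def limit_variance(a, d_star):
    """a D* a' for a row vector a and a diagonal D* given by its diagonal."""
    return sum(w * w * d for w, d in zip(a, d_star))

v0 = [1.0, 1.0, 1.0]          # every component has expectation theta
d_star = [1.0, 2.0, 4.0]      # assumed limiting variances of the components
a0, min_var = rban_weights_independent(v0, d_star)
equal_weights = [1.0 / 3.0] * 3

print(a0, min_var, limit_variance(equal_weights, d_star))
```

The weights are proportional to the reciprocals of the variances, and the unweighted average, which also satisfies A V_0 = 1, has a larger limiting variance.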
COROLLARY 3. Let $\hat\theta_k(\bar Z_m)$ be a regular and consistent estimate of $\theta_k$. Suppose
that the elements of the random vector $\bar Z_m$ are statistically independent. Then,

(i) in order for $\hat\theta_k(\bar Z_m)$ to be RBAN, it is sufficient that the vector $A_k$ satisfy the
condition

$$A_k = \varepsilon_k G_0^{-1} V_0' D^{*-1},$$

where $D^*$ and $G_0$ are defined as in Corollary 2;

(ii) the minimum asymptotic variance of $\hat\theta_k$ is given by

$$\sigma^2_{\hat\theta_k} = m^{-1} \varepsilon_k G_0^{-1} \varepsilon_k'.$$

The corollary is a direct consequence of the substitution $\sigma_{Z_{\alpha i} Z_{\alpha j}} = 0$, for
$i \ne j$; $i, j = 1, 2, \ldots, n$, in Corollary 1.
It is clear that an estimate having the minimal asymptotic covariance matrix 


in the sense of Definition 5 has a remarkable property of bestness, at least 
asymptotically. To make this point more apparent, the following general theorem 
is given. 

THEOREM 4. Let $\hat\theta$ be an RBAN estimate of $\theta$ for values of $\theta$ in a neighborhood
$\nu$ of $\theta^0$. Let $f(\theta)$ be a function of $\theta$ with its range in a Euclidean space.

(i) If $f(\theta)$ admits continuous partial derivatives $\Phi$ in the neighborhood of $\theta^0$,
then $f(\hat\theta)$ is an RBAN estimate of $f(\theta)$ for $\theta \in \nu$.

(ii) If the matrix

$$\Phi = \left.\frac{\partial f}{\partial \theta}\right|_{\theta^0}$$

has rank $s$, then in order for $f(\tilde\theta)$ to be an RBAN estimate of $f(\theta)$, it is necessary
and sufficient that the argument $\tilde\theta$ be an RBAN estimate of $\theta$.
PROOF. Let $h(\bar Z_m)$ be any other regular and consistent estimate of $f$, and let

$$H = \left.\frac{\partial h(\bar Z_m)}{\partial \bar Z_m}\right|_{\zeta^0}.$$

As in the proof of Theorem 3, we can show that $H$ must satisfy the relationship

(18) $H V_0 = \Phi$,

where $V_0$ is the derivative of $\zeta(\theta)$ taken at $\theta = \theta^0$, as defined in Theorem 3.

Since $\hat\theta$ is RBAN, the limiting distribution of $\sqrt{m}\,\{f[\hat\theta(\bar Z_m)] - f(\theta^0)\}$ is normal
with mean zero and covariance matrix $\Phi A_0 \sigma^* A_0' \Phi'$. Similarly, the limiting
distribution of $\sqrt{m}\,[h(\bar Z_m) - f(\theta^0)]$ is normal with mean zero and covariance
matrix $H \sigma^* H'$. To show that $f(\hat\theta)$ is best, it is sufficient to show that if $H$ satisfies
(18), then the difference $H \sigma^* H' - \Phi A_0 \sigma^* A_0' \Phi'$ is positive semidefinite. Let

$$\begin{aligned}
\Xi &= (H - \Phi A_0)\sigma^*(H - \Phi A_0)' \\
&= H \sigma^* H' - \Phi A_0 \sigma^* H' - H \sigma^* A_0' \Phi' + \Phi A_0 \sigma^* A_0' \Phi'.
\end{aligned}$$

Replacing $A_0$ by $C_0^{-1} V_0' \sigma^{*-1}$ gives

$$\Xi = H \sigma^* H' - \Phi C_0^{-1} V_0' H' - H V_0 C_0^{-1} \Phi' + \Phi C_0^{-1} \Phi',$$

or, owing to (18),

$$\Xi = H \sigma^* H' - \Phi C_0^{-1} \Phi' = H \sigma^* H' - \Phi A_0 \sigma^* A_0' \Phi'.$$

Since $\Xi$ is obviously positive semidefinite, part (i) is proved.
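To prove part (ii), let

$$\tilde A_0 = \left.\frac{\partial \tilde\theta(\bar Z_m)}{\partial \bar Z_m}\right|_{\zeta^0}$$

and consider the matrix $(\Phi \tilde A_0 - \Phi A_0)\sigma^*(\Phi \tilde A_0 - \Phi A_0)'$. Using the relations
$A_0 \sigma^* = C_0^{-1} V_0'$ and $A_0 V_0 = I_s = \tilde A_0 V_0$, simple computation leads to the equation

$$(\Phi \tilde A_0 - \Phi A_0)\sigma^*(\Phi \tilde A_0 - \Phi A_0)' = \Phi \tilde A_0 \sigma^* \tilde A_0' \Phi' - \Phi A_0 \sigma^* A_0' \Phi'.$$

The last difference, then, is positive semidefinite and vanishes only if $\Phi \tilde A_0 = \Phi A_0$. Since $\Phi$ has
rank $s$, this implies that $\tilde A_0 = A_0$; hence the necessity of the condition. The
sufficiency follows from part (i).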

COROLLARY. Let the random vector $(\hat\theta_1(\bar Z_m), \ldots, \hat\theta_r(\bar Z_m))'$ be a regular and
consistent estimate of $(\theta_1, \ldots, \theta_r)'$, for $r \le s$. Then,

(i) in order for the random vector to have “minimal” asymptotic covariance matrix,
or for the elements $\hat\theta_k(\bar Z_m)$, for $k = 1, 2, \ldots, r$, to have the respective minimum
asymptotic variances, it is sufficient that

$$\begin{pmatrix} A_1 \\ \vdots \\ A_r \end{pmatrix} = \Phi\, C_0^{-1} V_0' \sigma^{*-1},$$

$\Phi = \partial(\theta_1, \ldots, \theta_r)'/\partial\theta$ being an $r \times s$ matrix;

(ii) the minimal asymptotic covariance matrix of the random vector is

$$m^{-1} \Phi\, C_0^{-1} \Phi'.$$

The proof of the corollary is obvious.
The random vectors $Z_\alpha$ considered in this paper are assumed to be independent
but not necessarily identically distributed. If identical distribution is assumed, then
Assumption 2 is no longer necessary, and Assumptions 1 and 3 may be replaced,
respectively, by Assumptions 1' and 3'.

ASSUMPTION 1'. The second central moments

$$E\{(Z_{\alpha i} - \zeta^0_i)(Z_{\alpha j} - \zeta^0_j) \mid \theta = \theta^0\},$$

for $\alpha = 1, 2, \ldots, m$ and for $i, j = 1, 2, \ldots, n$, are finite.
ASSUMPTION 3'. Let $\boldsymbol{\theta}_k = (\theta_1, \theta_2, \ldots, \theta_k)'$, for $1 \le k \le s$. The expectation
$E(Z_\alpha) = \zeta(\theta)$ has continuous second partial derivatives with respect to $\boldsymbol{\theta}_k$,
and the matrix $V_k(\theta) = \partial\zeta(\theta) / \partial\boldsymbol{\theta}_k$ has rank $k$ in the neighborhood of the true
point $\theta^0$, for every $k$, $1 \le k \le s$.

In this case the strong law of large numbers can be used and the results will
be stronger: the vector $\bar Z_m$ tends almost surely to the expectation $\zeta(\theta^0)$, the
estimate $\hat\theta(\bar Z_m)$ tends almost surely to the true value $\theta^0$, and all the other
probabilistic statements will be stronger.


4. Methods of generating RBAN estimates. In this section methods are given
by which RBAN estimates can be obtained. Application of the first method
(Theorem 5) requires the knowledge of the matrix $\sigma^*$. In the second method
(Theorem 6), such knowledge is not assumed.

THEOREM 5. Let a quadratic form $Q(\bar Z_m, \theta)$ be defined by

(19) $Q(\bar Z_m, \theta) = [\bar Z_m - \zeta(\theta)]'\, \sigma^{*-1}(\theta)\, [\bar Z_m - \zeta(\theta)]$,

where $\sigma^*(\theta)$ is assumed to have continuous second partial derivatives with respect
to $\theta$ in the neighborhood of the true parameter point $\theta^0$. Let Assumptions 1 to 3 be
satisfied. Then,

(i) As $m \to \infty$, there exists, with a probability tending to one, one and only one
function $\hat\theta(\bar Z_m)$ which locally minimizes the quadratic form $Q(\bar Z_m, \theta)$;

(ii) The function $\hat\theta(\bar Z_m)$ is a consistent estimate of $\theta^0$;

(iii) $\hat\theta(\bar Z_m)$ is regular in the sense of Definition 3;

(iv) $\sqrt{m}\,[\hat\theta(\bar Z_m) - \theta^0]$ has an s-variate asymptotically normal distribution;
and

(v) $\hat\theta(\bar Z_m)$ has the asymptotic covariance matrix $m^{-1} C_0^{-1}$, with $C_0 = V_0' \sigma^{*-1} V_0$.

Thus $\hat\theta(\bar Z_m)$ is an RBAN estimate.

Because the proofs of Theorems 5 and 6 are analogous, only the proof of 
Theorem 6 is given. 

THEOREM 6. Let $S_m$ be a consistent estimate of $\sigma^*$ at $\theta = \theta^0$, and let a quadratic
form $Q(\bar Z_m, S_m, \theta)$ be defined by

(20) $Q(\bar Z_m, S_m, \theta) = [\bar Z_m - \zeta(\theta)]'\, S_m^{-1}\, [\bar Z_m - \zeta(\theta)]$.

Let Assumptions 1 to 3 be satisfied. Then,

(i) As $m \to \infty$, there exists, with a probability tending to one, one and only
one function $\hat\theta(\bar Z_m, S_m)$ which locally minimizes the quadratic form $Q(\bar Z_m, S_m, \theta)$;

(ii) The function $\hat\theta(\bar Z_m, S_m)$ is a consistent estimate of $\theta^0$;

(iii) $\hat\theta(\bar Z_m, S_m)$ is regular in the sense of Definition 3;

(iv) $\sqrt{m}\,[\hat\theta(\bar Z_m, S_m) - \theta^0]$ has an s-variate asymptotically normal distribution;
and

(v) $\hat\theta(\bar Z_m, S_m)$ has the asymptotic covariance matrix $m^{-1} C_0^{-1}$, with $C_0 = V_0' \sigma^{*-1} V_0$.

Thus, $\hat\theta(\bar Z_m, S_m)$ is an RBAN estimate.

In writing quadratic form (20), it is assumed that for every $m$
the matrix $S_m$ is positive definite with probability one.

PROOF. Differentiation of the quadratic form with respect to $\theta$ leads to the
equation

(21) $W(\bar Z_m, S_m, \theta) = V'(\theta)\, S_m^{-1}\, [\bar Z_m - \zeta(\theta)] = 0$,

with $V(\theta) = \partial\zeta(\theta) / \partial\theta$. Clearly, the derivative of $Q$ with respect to $\theta$ is equal
to $-2W$. In order to prove part (i) of Theorem 6, we have to show that as
$m \to \infty$, equation (21) has, with a probability tending to one, a root, $\hat\theta(\bar Z_m, S_m)$,
in the neighborhood of the true parameter point $\theta^0$, however small the
neighborhood, and that at the point $\hat\theta(\bar Z_m, S_m)$, the quadratic form attains a minimum.

Equation (21) is satisfied for $\bar Z_m = \zeta^0$, $S_m = \sigma^*$, and $\theta = \theta^0$, since $\zeta^0$ is written
for $\zeta(\theta^0)$. Clearly the function $W(\bar Z_m, S_m, \theta)$ possesses continuous first partial
derivatives with respect to $\bar Z_m$ and $S_m$. Assumption 3 on the differentiability of
the function $\zeta(\theta)$ implies that $W(\bar Z_m, S_m, \theta)$ possesses also continuous first
partial derivatives with respect to $\theta$. The derivatives taken at the point
$(\zeta^0, \sigma^*, \theta^0)$ are

$$\left.\frac{\partial W}{\partial \theta}\right|_{(\zeta^0, \sigma^*, \theta^0)} = \left.\left\{\frac{\partial V'(\theta)}{\partial \theta}\, S_m^{-1} [\bar Z_m - \zeta(\theta)] - V'(\theta)\, S_m^{-1}\, V(\theta)\right\}\right|_{(\zeta^0, \sigma^*, \theta^0)} = -V_0' \sigma^{*-1} V_0,$$

where $V_0 = V(\theta^0)$ is of rank $s$ and $\sigma^*$ is positive definite; hence, $V_0' \sigma^{*-1} V_0$ is
positive definite. It follows from the implicit function theorem ([6], p. 117)
that (a) there exists a region $R$ containing $(\zeta^0, \sigma^*)$ and a rectangular parallelepiped
$(\theta^*, \theta^{**})$ containing $\theta^0$ such that for every point, $(Z, S)$, say, inside the region
$R$, equation (21) holds for one and only one point $\theta = \theta(Z, S)$ inside the
parallelepiped $(\theta^*, \theta^{**})$; (b) the Jacobian $|\partial W / \partial \theta|$ taken inside $(\theta^*, \theta^{**})$ will have a
constant sign; (c) the function $\theta(Z, S)$ is a continuous function and possesses
continuous partial derivatives with respect to $Z$ and $S$; and (d) substitutions of
$Z = \zeta^0$ and $S = \sigma^*$ into $\theta(Z, S)$ lead to the equation $\theta(\zeta^0, \sigma^*) = \theta^0$, the true
parameter point.

Because of the convergence of $\bar Z_m$ to $\zeta^0$ and $S_m$ to $\sigma^*$, as $m \to \infty$, with a
probability tending to one, the point $(\bar Z_m, S_m)$ will be inside the region $R$, however
small $R$ is, and thus equation (21) will hold for one and only one point
$\hat\theta(\bar Z_m, S_m)$ inside $(\theta^*, \theta^{**})$.

It may be convenient to point out here that (c) and (d) imply the consistency
of the estimate $\hat\theta(\bar Z_m, S_m)$.

To show that the quadratic form $Q(\bar Z_m, S_m, \theta)$ attains a minimum at the point
$\hat\theta(\bar Z_m, S_m)$, we let $\boldsymbol{\theta}_k = (\theta_1, \theta_2, \ldots, \theta_k)'$, for $1 \le k \le s$, and let $V_k(\theta) =
\partial\zeta(\theta) / \partial\boldsymbol{\theta}_k$, which is of rank $k$. For $k = s$, $\boldsymbol{\theta}_s = \theta$ and $V_s(\theta) = V(\theta)$. The second
partial derivatives of $Q(\bar Z_m, S_m, \theta)$ with respect to $\boldsymbol{\theta}_k$ are

(22) $\dfrac{\partial^2 Q}{\partial \boldsymbol{\theta}_k^2} = -2\, \dfrac{\partial V_k'(\theta)}{\partial \boldsymbol{\theta}_k}\, S_m^{-1} [\bar Z_m - \zeta(\theta)] + 2 V_k'(\theta)\, S_m^{-1}\, V_k(\theta)$.

The first term on the right-hand side of (22) tends in probability to zero, and
the second term is positive definite in the neighborhood of $\theta^0$, for every $1 \le k \le s$.
Because of the convergence of $\hat\theta(\bar Z_m, S_m)$ to $\theta^0$, this means that the matrix
$V'(\theta)\, S_m^{-1}\, V(\theta)$ is positive definite for $\theta = \hat\theta(\bar Z_m, S_m)$ and that all of the principal
minors of the matrix are positive. It follows that $Q(\bar Z_m, S_m, \theta)$ is a minimum at
$\theta = \hat\theta(\bar Z_m, S_m)$ ([7], pp. 51-52).

Regularity of the estimate $\hat\theta(\bar Z_m, S_m)$ is implied in (c).

Since $\hat\theta(\bar Z_m, S_m)$ is a regular and consistent estimate of $\theta$, it follows from
Theorem 2 that $\sqrt{m}\,[\hat\theta(\bar Z_m, S_m) - \theta^0]$ is asymptotically normal, proving part
(iv).


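We now write

$$W(\bar Z_m, S_m, \hat\theta) - W(\bar Z_m, S_m, \theta^0) = \left\{\left.\frac{\partial V'}{\partial \theta}\right|_{\theta^*} S_m^{-1} [\bar Z_m - \zeta(\theta^*)] - V'(\theta^*)\, S_m^{-1}\, V(\theta^*)\right\} (\hat\theta - \theta^0),$$

where $\theta^* = \theta^0 + \delta_m(\hat\theta - \theta^0)$, with $\delta_m$ being an $s \times s$ diagonal matrix having all
diagonal elements between zero and unity. Transposing the derivative of $W$ to
the other side of the equality sign gives

$$\left\{V'(\theta^*)\, S_m^{-1}\, V(\theta^*) - \left.\frac{\partial V'}{\partial \theta}\right|_{\theta^*} S_m^{-1} [\bar Z_m - \zeta(\theta^*)]\right\} (\hat\theta - \theta^0) = V_0'\, S_m^{-1} [\bar Z_m - \zeta^0],$$

since $W(\bar Z_m, S_m, \hat\theta) = 0$ and $W(\bar Z_m, S_m, \theta^0) = V_0'\, S_m^{-1} [\bar Z_m - \zeta^0]$. By Lemma 1,
$\sqrt{m}\,(\hat\theta - \theta^0)$ has the same limiting distribution as

$$\sqrt{m} \left\{V'(\theta^*)\, S_m^{-1}\, V(\theta^*) - \left.\frac{\partial V'}{\partial \theta}\right|_{\theta^*} S_m^{-1} [\bar Z_m - \zeta(\theta^*)]\right\}^{-1} V_0'\, S_m^{-1} [\bar Z_m - \zeta^0],$$

or as

(23) $\sqrt{m}\, \{V_0' \sigma^{*-1} V_0\}^{-1} V_0' \sigma^{*-1} [\bar Z_m - \zeta^0]$,

since $\bar Z_m \to \zeta^0$, $S_m \to \sigma^*$, $\theta^* \to \theta^0$, and

$$V'(\theta^*)\, S_m^{-1}\, V(\theta^*) - \left.\frac{\partial V'}{\partial \theta}\right|_{\theta^*} S_m^{-1} [\bar Z_m - \zeta(\theta^*)] \to V_0' \sigma^{*-1} V_0.$$

According to Theorem 1, the quantity (23) is asymptotically normal with mean
zero and with an asymptotic covariance matrix

$$\{V_0' \sigma^{*-1} V_0\}^{-1} V_0' \sigma^{*-1}\, \sigma^*\, \sigma^{*-1} V_0 \{V_0' \sigma^{*-1} V_0\}^{-1} = \{V_0' \sigma^{*-1} V_0\}^{-1},$$

proving the theorem.

In the practical application of the preceding methods there may be difficulties
in solving such equations as (21) for $\theta$, since the function $\zeta$ might be a
complicated function of the parameter $\theta$. In such cases, devices suggested by Neyman
[1] may be used.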
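
When equation (21) has no closed-form root, one generic option is an iterative solution. The sketch below uses a Gauss-Newton iteration, which is a standard numerical technique and not one of the devices of [1]; the model zeta(theta) = (theta, theta^2)', the matrix `s_m`, the starting value, and all function names are assumptions made for illustration. Each step solves the linearized version of (21) for a scalar theta.

```python
# Gauss-Newton sketch for W(Z, S, theta) = V'(theta) S^{-1} [Z - zeta(theta)] = 0.
# Hypothetical model with s = 1, n = 2: zeta(theta) = (theta, theta**2)'.

def inv2(m):
    """Inverse of a 2 x 2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def zeta(theta):
    return [theta, theta * theta]

def v_of(theta):
    """V(theta) = d zeta / d theta, an n x 1 vector stored as a list."""
    return [1.0, 2.0 * theta]

def w_of(z_bar, s_inv, theta):
    """Left-hand side of equation (21) for scalar theta."""
    v = v_of(theta)
    resid = [z - f for z, f in zip(z_bar, zeta(theta))]
    return sum(v[i] * s_inv[i][j] * resid[j] for i in range(2) for j in range(2))

def minimize_q(z_bar, s_m, theta_start, steps=50, tol=1e-12):
    """Iterate theta <- theta + (V' S^{-1} V)^{-1} V' S^{-1} [Z - zeta(theta)]."""
    s_inv = inv2(s_m)
    theta = theta_start
    for _ in range(steps):
        v = v_of(theta)
        curvature = sum(v[i] * s_inv[i][j] * v[j] for i in range(2) for j in range(2))
        step = w_of(z_bar, s_inv, theta) / curvature
        theta += step
        if abs(step) < tol:
            break
    return theta

s_m = [[1.0, 0.2], [0.2, 0.5]]      # assumed sample covariance matrix
z_bar = zeta(1.5)                   # noise-free mean generated at theta = 1.5
theta_hat = minimize_q(z_bar, s_m, theta_start=1.0)
print(theta_hat)
```

With a noise-free mean vector the iteration should recover the generating value; with real data the result depends on the starting value, and only a local minimum is guaranteed, in line with part (i) of Theorem 6.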

In the light of Theorem 4, an estimate of a function $f(\theta)$, say $\hat f$, may first be
found and then $\hat f$ may be used to obtain the estimate of the parameter $\theta$,
provided that the function $f$ satisfies the regularity conditions assumed in Theorem
4. One possible such function is $f(\theta) = \zeta(\theta)$. Assumption 3 on the function $\zeta$
implies that the regularity conditions in Theorem 4 are satisfied. Thus the
quadratic form (20) may be minimized with respect to $\zeta$ to obtain the RBAN
estimate $\hat\zeta$ and then the equation $\hat\zeta = \zeta(\theta)$ may be solved for $\theta$. In doing so,
however, it should be remembered that $\zeta$ is a vector of $n$ components, and $\theta$
is a vector of $s \le n$ components. To ensure a unique solution for $\theta$ from the
estimate $\hat\zeta$, the function $\zeta$ must be subject to $n - s = r$ restrictions before
estimation takes place. Let the restrictions be represented by the equation

(24) $F^0 = F(\zeta^0) = 0$,

where $F^0$ is an $r \times 1$ column vector. Equation (24) is deduced by eliminating
the parameter $\theta$ from the equation $\zeta = f(\theta)$, with $f$ denoting the function of the
parameter $\theta$. The estimate obtained by this procedure is identically equal to the
one found by directly applying Theorem 6.

A second modification of the methods is suggested for the purpose of deriving
an explicit formula for the estimate $\hat\zeta$. Under Assumption 3 on the
differentiability of the function $\zeta$, the function $F(\zeta)$ has continuous partial derivatives with
respect to $\zeta$ and the matrix of the derivatives has rank $r$. Using Taylor's theorem,
we write the reduced form of the restrictions (24),

(25) $F^*(\zeta, \bar Z_m) = \hat F + \hat T(\zeta - \bar Z_m) = 0$,

where $\hat F = F(\bar Z_m)$ and

$$\hat T = \left.\frac{\partial F(\zeta)}{\partial \zeta}\right|_{\zeta = \bar Z_m}.$$

The idea is to minimize the quadratic form (20) with respect to $\zeta$, subject to the
reduced form, (25), instead of the original restrictions (24). The resulting
estimate is also RBAN. This is shown in the following lemma proved in [1], p. 257,
in the case of multinomial random variables.

LEMMA 4. Let $Q$ denote the quadratic form $Q(\bar Z_m, \theta)$ or $Q(\bar Z_m, S_m, \theta)$. If the
minimization of the quadratic form $Q$ under restriction (24) leads to an RBAN
estimate of the expectation $\zeta^0$, then the minimization of the same quadratic form under
the reduced restriction (25) will also lead to an RBAN estimate of $\zeta^0$.

An explicit formula for the estimate of $\zeta^0$ is given in the following:

LEMMA 5. The function $\hat\zeta(\bar Z_m, S_m)$, which minimizes the quadratic form

(20) $Q(\bar Z_m, S_m, \theta) = [\bar Z_m - \zeta(\theta)]'\, S_m^{-1}\, [\bar Z_m - \zeta(\theta)]$

subject to the reduced restrictions (25), is given by

(26) $\hat\zeta(\bar Z_m, S_m) = \bar Z_m - S_m \hat T' \hat P^{-1} \hat F$,

with $\hat P = \hat T S_m \hat T'$. The corresponding quadratic form is given by

(27) $Q(\bar Z_m, S_m, \hat\zeta) = \hat F' \hat P^{-1} \hat F$.

The first part of the lemma can be proved easily by using the Lagrange method
as outlined in the proof of Theorem 3. A direct computation gives the second
part of the lemma.
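
Formulas (26) and (27) are easy to evaluate directly. The sketch below assumes n = 2 and a single hypothetical restriction F(zeta) = zeta_2 - zeta_1^2 (so r = 1 and P is a scalar); the data, the matrix `s_m`, and the function names are illustrative assumptions. It computes the estimate from (26), then checks that the reduced restriction (25) holds and that the quadratic form agrees with (27).

```python
# Sketch of Lemma 5 for n = 2, r = 1 with the hypothetical restriction
# F(zeta) = zeta_2 - zeta_1**2 = 0 (illustrative numbers only).

def inv2(m):
    """Inverse of a 2 x 2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def lemma5_estimate(z_bar, s_m):
    f_hat = z_bar[1] - z_bar[0] ** 2                  # F evaluated at Z_bar
    t_hat = [-2.0 * z_bar[0], 1.0]                    # dF / d zeta at Z_bar (1 x 2)
    s_t = [sum(s_m[i][j] * t_hat[j] for j in range(2)) for i in range(2)]  # S_m T'
    p_hat = sum(t_hat[i] * s_t[i] for i in range(2))  # P = T S_m T' (scalar here)
    zeta_hat = [z_bar[i] - s_t[i] * f_hat / p_hat for i in range(2)]       # (26)
    q_min = f_hat * f_hat / p_hat                                          # (27)
    return zeta_hat, q_min, f_hat, t_hat

def quadratic_form(z_bar, s_m, zeta):
    """Direct evaluation of [Z - zeta]' S^{-1} [Z - zeta]."""
    s_inv = inv2(s_m)
    d = [z - f for z, f in zip(z_bar, zeta)]
    return sum(d[i] * s_inv[i][j] * d[j] for i in range(2) for j in range(2))

z_bar = [1.0, 1.3]
s_m = [[1.0, 0.2], [0.2, 0.5]]
zeta_hat, q_min, f_hat, t_hat = lemma5_estimate(z_bar, s_m)

# Left-hand side of the reduced restriction (25) at the estimate.
reduced = f_hat + sum(t * (zh - z) for t, zh, z in zip(t_hat, zeta_hat, z_bar))
print(zeta_hat, q_min, reduced)
```

Because (25) is only the linearization of (24) about the sample mean, the estimate satisfies the reduced restriction exactly but the original restriction only approximately.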

In obtaining RBAN estimates in a practical problem, the essential part of the 
work is deducing the side restrictions (24). Once the side restrictions are deduced 
RBAN estimates can be obtained by a straightforward computation. A detailed 
description of the procedure is given in [8]. 

An application of the methods has been made to a stochastic process of flour
beetles, and the work is being prepared for publication elsewhere.

THE LARGE-SAMPLE POWER OF RANK ORDER TESTS IN THE
TWO-SAMPLE PROBLEM¹


By Meyer Dwass
Northwestern University

1. Summary. This paper studies the large-sample power of certain rank order
tests against one-parameter alternatives in the two-sample problem.

The first $m$ of $N$ independent random variables are supposed identically
distributed, each with a density function $f_1(x, \theta)$, the remaining $N - m$ with a
density function $f_2(x, \theta)$. When $\theta = 0$ both density functions are the same.
Let $a_{N1}, \ldots, a_{NN}$ be a set of constants defined by (3.2) below; let $b_{N1}, \ldots, b_{NN}$
be another set of constants; and let $R_1, \ldots, R_N$ be the ranks of the $N$ random
variables. A statistic of the type $\sum_{i=1}^{N} a_{Ni} b_{NR_i}$ is called an $L$ statistic.

Part I of this paper characterizes the locally best rank order statistic for
testing $H_0\colon \theta = 0$ against the alternative that $\theta$ is positive and “close” to zero.
This turns out to be any one of an equivalent class of $L$ statistics. Under certain
regularity conditions it is possible to determine the large-sample power of $L$
statistics. Of particular interest is the large-sample power of the locally best $L$
statistic.

For arbitrary $b_{N1}, \ldots, b_{NN}$ it is usually difficult to determine whether the
regularity conditions hold. Hence, in Part II a special class of $L$ statistics, the
$L_N$ statistics, is studied. For these, the regularity conditions are easier to
verify and the large-sample power is determined. The best $L$ statistic can, in a
certain sense, be approximated by $L_N$ statistics.


Part I 


2. Large-sample power. Suppose it is known that for every positive integer $N$,
the joint c.d.f. of the random vector $X_N = (X_{N1}, \ldots, X_{NN})$ is a member of the
one-parameter class of c.d.f.'s, $\{F_{X_N, \theta},\ 0 \le \theta < \infty\}$. That is, the distribution of
$X_N$ depends on a nonnegative parameter $\theta$.

Consider

(a) the hypotheses, $H_0\colon \theta = 0$, $H_1\colon \theta > 0$,

(b) a statistic, $t_N = t_N(X_N)$, and

(c) a decision procedure:

(2.1) Accept $H_0$ if $t_N \le C_N$,
      Reject $H_0$ if $t_N > C_N$.

Let $P_\theta(\ )$ be the probability of the event in parentheses when $\theta$ is the true
parameter. Then the power function of (2.1) is $P_\theta\{t_N \ge C_N\}$, $0 \le \theta < \infty$. If
$t_N$ is asymptotically normal and certain regularity conditions hold, then it is
easy to approximate this power for large $N$. This is done in Theorem 2.1, below.

Received January 19, 1955.
¹ Much of the work of Part II of this paper was done under the sponsorship of the Office
of Naval Research.

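The following notation is used: $E_\theta Y$, $\operatorname{Var}_\theta Y$ are the expectation and variance
of a random variable $Y$ when $\theta$ is the true parameter;

$$E'_{N\theta} = \frac{\partial E_\theta t_N}{\partial \theta};$$

$\{\theta(N)\}$ is the sequence of parameter values,

$$\theta(N) = \delta N^{-1/2};$$

$\Phi(x)$ is the normal (0, 1) c.d.f.; and $\lambda_N$ is defined by

$$\lambda_N = (C_N - E_0(t_N))(\operatorname{Var}_0 t_N)^{-1/2}.$$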
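THEOREM 2.1. (Pitman)
ASSUMPTIONS.

(a) $P_{\theta(N)}\{(t_N - E_{\theta(N)} t_N)(\operatorname{Var}_{\theta(N)} t_N)^{-1/2} < s\} \to \Phi(s)$ as $N \to \infty$ for any $s$ and
any positive $\delta$.

(b) $E'_{N\theta}$ exists for all $\theta$ in a half-closed interval $[0, a)$, where $a$ does not depend
on $N$.

(c) $E'_{N\theta(N)} / E'_{N0} \to 1$, $\operatorname{Var}_{\theta(N)} t_N / \operatorname{Var}_0 t_N \to 1$, as $N \to \infty$, for any positive $\delta$.

(d) $E'_{N0} (N \operatorname{Var}_0 t_N)^{-1/2} \to c$ as $N \to \infty$.

(e) $\lambda_N \to \lambda$, as $N \to \infty$, where $\Phi(\lambda) = 1 - \alpha$.

CONCLUSION. Choose any positive $\varepsilon$ and positive $\delta$. Then there is an $N'$
(depending on $\varepsilon$ and $\delta$) such that

$$\left| P_\theta\{t_N \ge C_N\} - \left(1 - \Phi(\lambda - \theta N^{1/2} c)\right) \right| < \varepsilon$$

for $\theta N^{1/2} \le \delta$, and all $N \ge N'$.

PROOF. See [13].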

REMARKS ABOUT THEOREM 2.1:

(a) As justified by this theorem, $1 - \Phi(\lambda - \delta c)$ is called the large-sample power
of the test described by (2.1).

(b) The following is a slight extension that is useful later: The statistics $t_N$
and $\tilde t_N$ are called asymptotically equivalent if

$$\frac{t_N - E_{\theta(N)} t_N}{(\operatorname{Var}_{\theta(N)} t_N)^{1/2}} - \frac{\tilde t_N - E_{\theta(N)} \tilde t_N}{(\operatorname{Var}_{\theta(N)} \tilde t_N)^{1/2}}$$

converges in probability to zero, as $N \to \infty$. Evidently the large-sample power is
the same using either $t_N$ or $\tilde t_N$.

(c) Let $t_N$, $\tilde t_{\tilde N}$ be two competing statistics, the first based on sample size $N$,
the second based on sample size $\tilde N$. If

$$\frac{N}{\tilde N} \to e = \left(\frac{\tilde c}{c}\right)^2 \quad (\tilde c \text{ defined analogously to } c),$$

as $N, \tilde N \to \infty$, then the two large-sample powers will be the same. The number
$e$ is called the asymptotic efficiency of $\tilde t_{\tilde N}$ relative to $t_N$.
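
Remarks (a) and (c) lend themselves to a short computation. The sketch below evaluates the large-sample power 1 - Phi(lambda - delta c) with the standard library's normal distribution, and then illustrates the efficiency remark under the assumption that the efficiency is the limiting ratio of sample sizes at which the two tests have equal large-sample power, i.e. the square of the ratio of the two constants. The values of `c`, `c_tilde`, `alpha`, `theta` and `n`, and the function names, are hypothetical.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # the normal (0, 1) distribution, Phi in the text

def large_sample_power(delta, c, alpha):
    """Large-sample power 1 - Phi(lambda - delta * c), where Phi(lambda) = 1 - alpha."""
    lam = STD_NORMAL.inv_cdf(1.0 - alpha)
    return 1.0 - STD_NORMAL.cdf(lam - delta * c)

def asymptotic_efficiency(c, c_tilde):
    """Assumed convention: limiting ratio N / N_tilde that equalizes the two powers."""
    return (c_tilde / c) ** 2

alpha = 0.05
c, c_tilde = 1.0, 0.8          # hypothetical limits from assumption (d)
e = asymptotic_efficiency(c, c_tilde)

theta, n = 0.25, 64            # a fixed alternative and a sample size for t_N
n_tilde = n / e                # sample size needed by the competing statistic
power_t = large_sample_power(theta * n ** 0.5, c, alpha)
power_t_tilde = large_sample_power(theta * n_tilde ** 0.5, c_tilde, alpha)
print(e, n_tilde, power_t, power_t_tilde)
```

At delta = 0 the function returns the level alpha, which is a quick sanity check on the parametrization.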


3. Rank order tests. The remainder of this paper deals with rank order tests
in the two-sample problem. What this means is now made specific.

ASSUMPTION A. $X_N = (X_1, \ldots, X_m, X_{m+1}, \ldots, X_{m+n})$ consists of $m + n = N$
independent random variables. The first $m$ are identically distributed, each with
density function $f_1(x, \theta)$; the remaining $n$ are identically distributed, each with
density function $f_2(x, \theta)$. These are density functions with respect to Lebesgue
measure on the real line, which satisfy the conditions

(3.1) $f_1(x, 0) = f_2(x, 0) = f(x)$, $(-\infty < x < \infty)$.

Also $m/N \to K$, $(0 < K < 1)$, as $N \to \infty$.
Assumption A holds throughout this paper even when not explicitly men- 
tioned. The hypotheses considered are always 


$$H_0: \theta = 0, \qquad H_1: \theta > 0.$$


DEFINITION 3.1. Let $R_i$ = the number of $X_1, \ldots, X_N$ that are less than or
equal to $X_i$; $R_i$ is called the rank of $X_i$, $(i = 1, \ldots, N)$. Let $R = (R_1, \ldots, R_N)$.

Assumption A implies that when $\theta = 0$, the random vector $R$ takes on for its
values each of the $N!$ permutations of $(1, \ldots, N)$ with equal probability $1/N!$.
(The assumption about density functions implies that $X_1, \ldots, X_N$ have
continuous c.d.f.'s, and hence ties among them occur with probability zero.)

Let the $N!$ permutations of $(1, \ldots, N)$ be ordered in some fixed way. Denote
these permutations by $p_1, p_2, \ldots, p_{N!}$. Let $S_i$ be the set in $N$-space where the
random vector $R$ equals $p_i$.
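
The rank vector of Definition 3.1 and the partition of $N$-space into the sets $S_i$ can be illustrated with a short sketch. This is only an illustration under the no-ties assumption; the function names are hypothetical. It shows that distinct orderings of the same sample values give distinct rank vectors, so there are exactly $N!$ sets $S_i$.

```python
from itertools import permutations


def ranks(xs):
    """R_i = number of observations less than or equal to x_i (no ties assumed)."""
    return tuple(sum(1 for xj in xs if xj <= xi) for xi in xs)


def rank_vectors(values):
    """Collect the rank vector of every ordering of distinct sample values."""
    return {ranks(order) for order in permutations(values)}


if __name__ == "__main__":
    print(ranks((0.3, 1.7, -0.2)))
    print(len(rank_vectors((0.1, 0.5, 0.9, 2.0))))
```

When $\theta = 0$ the observations are exchangeable, so each of these rank vectors is equally likely, which is the statement made after Definition 3.1.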

Let $S$ be any set of points in $N$-space which does not depend on $\theta$. Let

$$f(x, \theta) = \prod_{i=1}^{m} f_1(x_i, \theta) \prod_{j=m+1}^{N} f_2(x_j, \theta),$$

$$dx = \prod_{i=1}^{N} dx_i,$$

$$I_S(\theta) = \int_S f(x, \theta)\, dx.$$


By the carrier of a density function $f(x)$ is meant the closure of the set of points
on the real line where $f(x) > 0$. The following assumption is stated for later
reference.

ASSUMPTION B. The carriers of $f_1(x, \theta)$, $f_2(x, \theta)$ do not depend on $\theta$ and the
following differentiation under the integral sign is permissible:

$$I_S'(\theta) = \int_S \frac{\partial f(x, \theta)}{\partial\theta}\, dx.$$

Also $I_S'(\theta)$ is a continuous function of $\theta$ in some half-closed interval $[0, a)$, $a > 0$.

DEFINITION 3.2. A rank order test is a set W in N-space which is a union of 
some of the sets S;, --- , Sy: . Ho is accepted if and only if the observed value 
of Xw is in W. (In other words, the acceptance of Hp depends only on the ranks.) 
$t_N$ is called a rank order statistic if it is constant on each set $S_i$. (In other words,
$t_N$ depends on $X_N$ only through the ranks.)

Theorem 3.1, below, states some obvious but useful facts about rank order
tests. The following notation is needed for Theorem 3.1: Let

$$P_0'(S_i) = \left.\frac{\partial P_\theta(S_i)}{\partial\theta}\right|_{\theta=0}, \qquad (i = 1, \ldots, N!).$$

Let $\theta'$ be positive and let the two sets of integers $(i_1, \ldots, i_r)$, $(j_1, \ldots, j_r)$ be
determined by the requirements that

$$P_{\theta'}(S_{i_1}) \ge \cdots \ge P_{\theta'}(S_{i_r}) \ge \cdots \ge P_{\theta'}(S_{i_{N!}})$$

and

$$P_0'(S_{j_1}) \ge \cdots \ge P_0'(S_{j_r}) \ge \cdots \ge P_0'(S_{j_{N!}}).$$

Let $\alpha = r/N!$.

THEOREM 3.1. Assumptions A and B imply that

(a) A most powerful, size $\alpha$, rank order test of $H_0$ against the specific alternative
that $\theta = \theta'$ is given by

$$W = S_{i_1} \cup \cdots \cup S_{i_r}.$$

(b) There is a number $\theta'' = \theta''(N) > 0$ such that a uniformly most powerful,
size $\alpha$, rank order test of $H_0$ against the alternative, $0 < \theta < \theta''$, is given by

$$W' = S_{j_1} \cup \cdots \cup S_{j_r}.$$

The proof is an immediate consequence of the definition of the sets $(i_1, \ldots, i_r)$,
$(j_1, \ldots, j_r)$.

DEFINITION 3.3. Two statistics $t_N$, $t_N'$ are called equivalent ($t_N : t_N'$) if
$t_N = at_N' + b$, where $a$, $(a > 0)$, and $b$ are constants which may depend on $N$. (It
is easy to verify that ":" is a bona fide equivalence relationship.)

DEFINITION 3.4. A test $W$ is said to be derived from a statistic $t_N$ if $W$ is the
set of points in $N$-space for which $t_N \ge C$ for some constant $C$.

If $W$ is derived from $t_N$ and $t_N : t_N'$, then $W$ is also derived from $t_N'$. (This follows
immediately from the definitions.)

Theorem 3.2, below, gives the structure of a rank order test which is uniformly
most powerful for $\theta$ close to 0. The following notation is introduced: Let
$Z_{N1} \le Z_{N2} \le \cdots \le Z_{NN}$ be the ordered values of $X_1, \ldots, X_N$. Let

$$H_i(x) = \left.\frac{\partial \log f_i(x, \theta)}{\partial\theta}\right|_{\theta=0}, \qquad H(x) = H_1(x) - H_2(x).$$

Let

(3.2) $a_{Ni} = (n/mN)^{1/2}$ for $i = 1, \ldots, m$, and $a_{Ni} = -(m/nN)^{1/2}$ for $i = m + 1, \ldots, N$.

(The facts that $\sum a_{Ni} = 0$, $\sum a_{Ni}^2 = 1$ are used later.) ($\sum$, without display of
indices, hereafter means $\sum_{i=1}^{N}$.)

THEOREM 3.2. Assumptions A and B imply that the test $W'$ of Theorem 3.1 is
derived from

(3.3) $t_N = \sum a_{Ni} E_0 H(Z_{NR_i})$.

PROOF. By differentiating and using Assumption B,

$$N!\,P_0'(S_j) = \sum_{i=1}^{m} E_0 H_1(Z_{Nl_i}) + \sum_{i=m+1}^{N} E_0 H_2(Z_{Nl_i}),$$

where $l_1, \ldots, l_N$ is the value of $R$ on the set $S_j$.

Set $t_N'(X_N) = N!\,P_0'(S_j)$ for $X_N$ in $S_j$, $(j = 1, \ldots, N!)$. Next notice that
$\sum E_0 H_1(Z_{Ni}) = \sum E_0 H_2(Z_{Ni}) = 0$. This is because $\sum E_0 H_1(Z_{Ni}) = \sum E_0 H_1(X_i)$
and

(3.4) $E_0 H_1(X_i) = \int_{-\infty}^{\infty} H_1(t) f(t)\, dt = \left.\frac{\partial}{\partial\theta}\int_{-\infty}^{\infty} f_1(t, \theta)\, dt\,\right|_{\theta=0} = \left.\frac{\partial(1)}{\partial\theta}\right|_{\theta=0} = 0.$

(Similarly for $H_2$.) Hence,

$$t_N' = \sum_{i=1}^{m} E_0 H(Z_{Nl_i}) = -\sum_{i=m+1}^{N} E_0 H(Z_{Nl_i}),$$

and

(3.5) $t_N = a_{N1} t_N' - a_{N,m+1} t_N' = (N/mn)^{1/2} t_N'$,

or $t_N : t_N'$. This completes the proof.

REMARKS ON THEOREM 3.2.

(a) Suppose $f_i(x, \theta)$ is the density function of a normal $(m_i, \sigma^2)$ random variable
$(i = 1, 2)$ and that $\theta = (m_1 - m_2)/\sigma$. Then the statistic $t_N$ of Theorem 3.2 is
equivalent to $\sum a_{Ni} E Z_{NR_i}$, where the $Z_{Ni}$ are the ordered values of $N$ independent
normal (0, 1) random variables. This was established differently by Hoeffding
in [4].

(b) This is an example that will be used later. Let $f(x)$ be a density function
that does not depend on $\theta$. Let $F(x) = \int_{-\infty}^{x} f(t)\, dt$. Let

$$f_1(x, \theta) = f(x),$$
$$f_2(x, \theta) = 2\theta F(x) f(x) + (1 - \theta) f(x) \qquad (0 \le \theta \le 1)$$

($f_2 = \frac{d}{dx}[\theta F^2 + (1 - \theta)F]$). Then for $\theta$ close to zero, Theorem 3.2 says the most
powerful test is derived from $t_N = \sum a_{Ni} E Z_{NR_i}$ where $Z_{N1} \le \cdots \le Z_{NN}$ are the
ordered values of $N$ independent random variables, uniformly distributed on
(0, 1). Since $EZ_{Ni} = i/(N + 1)$, $t_N$ is equivalent to $\sum a_{Ni} R_i$, which is equivalent
to the Wilcoxon-Mann-Whitney statistic. This result was found by Lehmann
by examining the probabilities $P_\theta(S_i)$ (see [10]).

(c) It would be interesting to know when a test that is most powerful for
$\theta$ close to zero is uniformly most powerful for all $\theta$ in a wider interval. Teichroew
[16] has presented some empirical evidence that this may be so in the normal case
discussed in Remark (a). This writer has computational evidence of this in some
special cases. This is presumably an open problem.
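
The equivalence claimed in Remark (b), in the sense of Definition 3.3, can be checked directly: $\sum a_{Ni} R_i$ is an increasing linear function of the rank sum $W_1$ of the first sample. The sketch below is an illustration only; the function names are hypothetical, and the closed form in `from_rank_sum` is derived here from (3.2) rather than quoted from the paper.

```python
from math import sqrt, isclose


def a_scores(m, n):
    """The constants a_{Ni} of (3.2): positive for sample 1, negative for sample 2."""
    N = m + n
    return [sqrt(n / (m * N))] * m + [-sqrt(m / (n * N))] * n


def linear_rank_statistic(rank_vector, m, n):
    """t_N = sum_i a_{Ni} R_i."""
    return sum(a * r for a, r in zip(a_scores(m, n), rank_vector))


def from_rank_sum(w1, m, n):
    """The same statistic written as a*W_1 + b, W_1 being the rank sum of sample 1."""
    N = m + n
    return sqrt(N / (m * n)) * w1 - sqrt(m / (n * N)) * N * (N + 1) / 2


if __name__ == "__main__":
    R = (2, 5, 7, 1, 3, 4, 6)   # ranks of X_1..X_7, first m = 3 from sample 1
    print(linear_rank_statistic(R, 3, 4), from_rank_sum(2 + 5 + 7, 3, 4))
```

The slope $(N/mn)^{1/2}$ is positive, so a test that rejects for large $\sum a_{Ni} R_i$ is the same test as one that rejects for large $W_1$.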

DEFINITION 3.5. The statistic (3.3) is called the locally best rank order statistic
and the derived test is called the locally best rank order test.


4. The large-sample power of L tests. 


DEFINITION 4.1. Let $b_{N1}, \ldots, b_{NN}$ be a set of constants given for every $N$.
Let $a_{N1}, \ldots, a_{NN}$ be defined by (3.2). The rank order statistic

(4.1) $t_N = \sum a_{Ni} b_{NR_i}$

is called an L statistic. A test $W$ derived from $t_N$ is called an L test.

Theorem 3.2 states that the locally best rank order test is an L test.

LEMMA 4.1. Let $b_{N1}, \ldots, b_{NN}$; $b_{N1}', \ldots, b_{NN}'$ be two sets of constants given for
every $N$. Let $\bar b_N$, $\bar b_N'$ be the averages of the two sets of numbers. Then

$$E_0\left(\sum a_{Ni} b_{NR_i}\right)\left(\sum a_{Ni} b_{NR_i}'\right) = \sum (b_{Ni} - \bar b_N)(b_{Ni}' - \bar b_N')/(N - 1).$$

This is proved by an elementary computation.
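
For small $N$ the identity of Lemma 4.1 can be confirmed by brute force, averaging over all $N!$ equally likely rank vectors. The following sketch is an illustration with hypothetical function names and arbitrary test constants; it is not taken from the paper.

```python
from itertools import permutations
from math import sqrt, isclose


def a_scores(m, n):
    """The constants a_{Ni} of (3.2)."""
    N = m + n
    return [sqrt(n / (m * N))] * m + [-sqrt(m / (n * N))] * n


def null_cross_moment(b, b2, m, n):
    """Exact E_0[(sum a_i b_{R_i})(sum a_i b2_{R_i})] by enumerating all rank vectors."""
    a = a_scores(m, n)
    N = m + n
    total, count = 0.0, 0
    for perm in permutations(range(N)):      # perm[i] plays the role of R_i - 1
        s1 = sum(a[i] * b[perm[i]] for i in range(N))
        s2 = sum(a[i] * b2[perm[i]] for i in range(N))
        total += s1 * s2
        count += 1
    return total / count


def lemma_4_1_rhs(b, b2):
    """sum (b_i - mean b)(b2_i - mean b2) / (N - 1)."""
    N = len(b)
    mb, mb2 = sum(b) / N, sum(b2) / N
    return sum((x - mb) * (y - mb2) for x, y in zip(b, b2)) / (N - 1)


if __name__ == "__main__":
    b = [1.0, 4.0, 9.0, 16.0, 25.0]
    b2 = [0.5, -1.0, 2.0, 0.0, 3.0]
    print(null_cross_moment(b, b2, 2, 3), lemma_4_1_rhs(b, b2))
```

Taking the two sets of constants equal gives the null variance of an L statistic, which is how the lemma is used in the proof of Theorem 4.1.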

The next theorem gives some information about the large-sample power of a
rank order test derived from (4.1). It is assumed that $\sum b_{Ni} = 0$. This involves
no loss of generality since it gives a $t_N$ equivalent to (4.1). It should be recalled
that $Z_{N1} \le \cdots \le Z_{NN}$ are the ordered values of $X_1, \ldots, X_N$, and that by
Assumption A, $m/N \to K$ as $N \to \infty$.

THEOREM 4.1.

ASSUMPTIONS.

(a) Assumptions A and B hold.

(b) Assumptions (a), (b), (c) and (e) of Theorem 2.1 hold.

(c) $N^{-1} \sum b_{Ni} E_0 H(Z_{Ni}) \to c'$, $N^{-1} \sum b_{Ni}^2 \to (c'')^2$, as $N \to \infty$.

CONCLUSION. The large-sample power of the rank order test derived from (4.1) is
$1 - \Phi(\lambda - \theta N^{1/2} c)$, where $c = K^{1/2}(1 - K)^{1/2} c'/c''$.

PROOF. The only thing that needs to be verified is Condition (d) of Theorem
2.1. Let $l_1, \ldots, l_N$ be the value of $R$ on $S_j$. Then

$$E'_{N,0} = \left.\frac{\partial E_\theta t_N}{\partial\theta}\right|_{\theta=0} = \sum_{j=1}^{N!} \left(\sum a_{Ni} b_{Nl_i}\right) N!\,P_0'(S_j)/N! = (mn)^{1/2} N^{-1/2} (N - 1)^{-1} \sum b_{Ni} E_0 H(Z_{Ni})$$

by (3.5) and by Lemma 4.1. Also $\mathrm{Var}_0 t_N = (N - 1)^{-1} \sum b_{Ni}^2$ by Lemma 4.1.
Hence

$$E'_{N,0}(N\,\mathrm{Var}_0 t_N)^{-1/2} \to c.$$


REMARKS ON THEOREM 4.1.

(a) If $t_N$ is the locally best rank order statistic (according to Definition 3.5),
then

$$c^2 = K(1 - K) \lim_{N\to\infty} N^{-1} \sum [E_0 H(Z_{Ni})]^2.$$

Hoeffding has studied limits of this sort in [6]. His results involve restrictions,
some of which do not seem appropriate for the problem here. (The main restriction
requires the convexity of $H(x)$.) A reasonable conjecture on the basis of
Hoeffding's work in [6] is, however, that

$$N^{-1} \sum [E_0 H(Z_{Ni})]^2 \to \int_{-\infty}^{\infty} H^2(x) f(x)\, dx, \quad \text{as } N \to \infty$$

($f$ is defined in (3.1)). This conjecture tends to be borne out by the work of Part
II of this paper.

(b) Let $(f_1(x, \theta), f_2(x, \theta))$, $(\bar f_1(x, \theta), \bar f_2(x, \theta))$ be two (possibly different) sets of
alternatives. Let $\bar f(x) = \bar f_1(x, 0) = \bar f_2(x, 0)$ and let $\bar E_0$, $\bar Z_{Ni}$, $\bar H$ be defined in exact
analogy to $E_0$, $Z_{Ni}$, $H$. If the actual alternative is $(f_1, f_2)$ and the test is derived
from $\bar t_N = \sum a_{Ni} \bar E_0 \bar H(\bar Z_{NR_i})$, then

$$c = K^{1/2}(1 - K)^{1/2} \lim_{N\to\infty} \frac{N^{-1} \sum \bar E_0 \bar H(\bar Z_{Ni})\, E_0 H(Z_{Ni})}{\left(N^{-1} \sum [\bar E_0 \bar H(\bar Z_{Ni})]^2\right)^{1/2}}.$$

Let $F(x) = \int_{-\infty}^{x} f(t)\, dt$ and suppose $x = \rho(t)$, the inverse of $F(x) = t$, exists. Make
analogous definitions of $\bar F$, $\bar\rho$. Since

$$\int_{-\infty}^{\infty} H^2(x) f(x)\, dx = \int_0^1 H^2(\rho(t))\, dt,$$

a possible extension of the conjecture in Remark (a) would be that

$$N^{-1} \sum \bar E_0 \bar H(\bar Z_{Ni})\, E_0 H(Z_{Ni}) \to \int_0^1 \bar H(\bar\rho(t)) H(\rho(t))\, dt$$

as $N \to \infty$, in which case

$$c = K^{1/2}(1 - K)^{1/2}\, \frac{\int_0^1 \bar H(\bar\rho(t)) H(\rho(t))\, dt}{\left(\int_0^1 \bar H^2(\bar\rho(t))\, dt\right)^{1/2}}.$$

This conjecture also tends to be borne out by the results of Part II of this paper.

(c) If

(4.2) $t_N = \sum a_{Ni}\, p(R_i/N)$,

where $p(x)$ is a polynomial $(0 \le x \le 1)$, then it will be shown below that the
assumptions of Theorem 4.1 hold under easy conditions. The importance of
this approach is the following: Let $r(x)$ be a continuous function on (0, 1) for
which $r(i/N) = E_0 H(Z_{Ni})$. As $r$ can be approximated by a polynomial $p(x)$ of
high degree, it is reasonable that the large-sample power of the test derived from
(3.3) should be approximated by the large-sample power of the test derived from
(4.2). In Section 5, a heuristic upper bound is given for the large-sample power of
the locally best rank order test. In Part II it is shown that this upper bound can
be approximated as closely as one pleases using rank order statistics of the form
(4.2), where $p(x)$ is of sufficiently high degree.


5. A heuristic upper bound for the large-sample power of a rank order test.
By the Neyman-Pearson lemma, an optimum statistic for testing $H_0$ against the
alternative of a specific $\theta$ is

(5.1) $t_N = \sum_{i=1}^{m} \log \frac{f_1(X_i, \theta)}{f_1(X_i, 0)} + \sum_{i=m+1}^{N} \log \frac{f_2(X_i, \theta)}{f_2(X_i, 0)}.$

Of course, (5.1) is in general not a rank order statistic. As a sum of independent
random variables, (5.1) will under quite general circumstances be asymptotically
normal. Assuming that the conditions of Theorem 2.1 hold, a heuristic derivation
of the large-sample power of the test based on (5.1) is now given. (Notice that
here $t_N$ depends on $\theta$, which is not the case in Theorem 2.1.)

$$(E_\theta t_N - E_0 t_N)/N\theta^2 = \frac{m}{N} \int \frac{\log f_1(x, \theta) - \log f_1(x, 0)}{\theta} \cdot \frac{f_1(x, \theta) - f_1(x, 0)}{\theta}\, dx + \frac{n}{N} \int \frac{\log f_2(x, \theta) - \log f_2(x, 0)}{\theta} \cdot \frac{f_2(x, \theta) - f_2(x, 0)}{\theta}\, dx.$$

Set $\theta = \delta N^{-1/2}$ and assume that the limiting operations involved may be
interchanged. Then

(5.2) $(E_\theta t_N - E_0 t_N)/N\theta^2 \to K \int H_1^2(x) f(x)\, dx + (1 - K) \int H_2^2(x) f(x)\, dx$,

as $N \to \infty$.
In a similar way one finds that $\mathrm{Var}_0 t_N/N\theta^2$ approaches the same limit. (The
computations use (3.4).) Hence,

$$c^2 = \lim_{N\to\infty} \frac{(E_\theta t_N - E_0 t_N)^2}{N\theta^2\, \mathrm{Var}_0 t_N}$$

is equal to the right-hand side of (5.2). The Neyman-Pearson lemma implies that
$1 - \Phi(\lambda - \delta c)$ is an upper bound for the large-sample power of the optimum
rank order test.

The alternative considered above is that the distribution of $X_1, \ldots, X_N$ is
determined by $(f_1(x, \theta), f_2(x, \theta))$. If one considers instead the alternative that the
random variables are distributed as $T(X_1), \ldots, T(X_N)$ where $T$ is an increasing
function, then there is no difference between the ranks of the $X_i$ and the ranks of
the $T(X_i)$; hence, the power of one rank order test against either of these
alternatives is the same. Thus, the upper bound found above may be tightened if in the
alternative $(f_1(x, \theta), f_2(x, \theta))$, $f_1$ and $f_2$ are replaced by the density functions of
suitably transformed variables. In particular, suppose that the $N$ random
variables are distributed as

$$K F_1(X_i, \theta) + (1 - K) F_2(X_i, \theta)$$

where

$$F_i(x, \theta) = \int_{-\infty}^{x} f_i(t, \theta)\, dt.$$

Let $f_1^*(x, \theta)$, $f_2^*(x, \theta)$ be the density functions of the transformed random variables.
It is easy to verify that

(5.3) $K f_1^*(x, \theta) + (1 - K) f_2^*(x, \theta) = 1.$

Hence, for purposes of obtaining an upper bound for the large-sample power of a
rank order test, there is no loss in supposing that (5.3) holds for $f_1$, $f_2$ in the
earlier argument. An easy calculation shows that under condition (5.3) on $f_1$, $f_2$,

$$c^2 = K(1 - K) \int_{-\infty}^{\infty} H^2(x) f(x)\, dx.$$

Notice that this ties in with the conjecture of Remark (a), Theorem 4.1.
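
The "easy calculation" can be sketched as follows. This is a reconstruction, not the author's own working, and it assumes that (5.3) is taken to hold for $f_1$, $f_2$ identically in $\theta$, so that $f(x) = 1$ on (0, 1) and differentiation with respect to $\theta$ at $\theta = 0$ is allowed.

```latex
% Assumption for this sketch: K f_1 + (1-K) f_2 = 1 for every theta,
% hence f(x) = 1 on (0,1) and (d/d theta) f_i at theta = 0 equals H_i(x) f(x) = H_i(x).
\begin{align*}
K f_1(x,\theta) + (1-K) f_2(x,\theta) = 1
  &\;\Longrightarrow\; K H_1(x) + (1-K) H_2(x) = 0, \\
H = H_1 - H_2
  &\;\Longrightarrow\; H_1 = (1-K)\,H, \qquad H_2 = -K\,H, \\
K H_1^2 + (1-K) H_2^2
  &= K(1-K)^2 H^2 + (1-K) K^2 H^2 = K(1-K)\,H^2 .
\end{align*}
```

Substituting the last line into the right-hand side of (5.2) gives $c^2 = K(1 - K)\int H^2(x) f(x)\, dx$.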

The development of this section is extremely heuristic and unrigorous. Suitable 
regularity conditions can no doubt be put down to make everything correct. 
Since the results of this section will not be used, the matter will be left as it is. 
In the case of the usual examples of particular density functions, $f_1$, $f_2$, the results
can often be verified directly.


PART II


6. U statistics. In this section, U statistics are studied in order to obtain
information about the related statistics (4.2).

DEFINITION 6.1. Let $u(x_1, \ldots, x_p; x_{p+1}, \ldots, x_{p+q})$ be a function of $p + q$
variables which is symmetric in the first $p$ variables (that is, invariant under all
permutations of the labels $1, \ldots, p$) and which is symmetric in the last $q$
variables. Such a $u$ will be called a $(p, q)$ symmetric function.

Let $\alpha_1, \ldots, \alpha_p$; $\beta_1, \ldots, \beta_q$ be $p + q$ integers subject to the restrictions

(6.1) $1 \le \alpha_1 < \cdots < \alpha_p \le m$; $\quad m + 1 \le \beta_1 < \cdots < \beta_q \le N$.

There are, of course, $\binom{m}{p}$ sets of $\alpha_i$'s and $\binom{n}{q}$ sets of $\beta_i$'s satisfying (6.1).

DEFINITION 6.2. Let

(6.2) $U = \binom{m}{p}^{-1} \binom{n}{q}^{-1} \sum{}' u(X_{\alpha_1}, \ldots, X_{\alpha_p}; X_{\beta_1}, \ldots, X_{\beta_q}),$

where $\sum'$ means summation over all $\binom{m}{p}\binom{n}{q}$ choices of the indices, subject to
(6.1). Any statistic of the form (6.2) is called a U statistic. This generalizes the
terminology of Hoeffding [3], who studied the case $q = 0$.

The first problem is to study the asymptotic normality of U statistics as
$N \to \infty$ ($p$, $q$ fixed). Under suitable regularity conditions, asymptotic normality
was established by Hoeffding [3] for the case $q = 0$ and by Lehmann [9] for the
case $p = q$. (Lehmann's proof was, for simplicity, done for $m = n$.) What follows
admits the possibility that $u$ may depend on $N$ and that $\theta$ may depend on $N$.
The methods are essentially those of Hoeffding in [3].

Let $u$, $u_N$ be two $(p, q)$ symmetric functions. Assumption $\alpha$, below, will
prescribe a certain sense in which $u_N \to u$ as $N \to \infty$. Let $U_N$ denote the U statistic
determined by $u_N$ as in Definition 6.2. Let $\theta = \theta(N)$ be a function of $N$ such
that $\theta(N) \to \theta$ as $N \to \infty$. (This includes the possibility that $\theta(N) = \theta$ for all $N$.)
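DEFINITION 6.3.

$$u_{N,r,s} = u_{N,r,s}(x_{\alpha_1}, \ldots, x_{\alpha_r}; x_{\beta_1}, \ldots, x_{\beta_s}) = E_\theta u_N(x_{\alpha_1}, \ldots, x_{\alpha_r}, X_{\alpha_{r+1}}, \ldots, X_{\alpha_p}; x_{\beta_1}, \ldots, x_{\beta_s}, X_{\beta_{s+1}}, \ldots, X_{\beta_q}),$$

$$M_{N,\theta} = E_\theta u_{N,r,s}(X_{\alpha_1}, \ldots, X_{\alpha_r}; X_{\beta_1}, \ldots, X_{\beta_s}) \quad \text{(this also equals } E_\theta u_N\text{)},$$

$$\Psi_{N,\theta} = u_N - M_{N,\theta}, \qquad \Psi_{N,\theta,r,s} = u_{N,r,s} - M_{N,\theta},$$

$$\rho_{N,\theta} = E_\theta \Psi_{N,\theta}^2, \qquad \rho_{N,\theta,r,s} = E_\theta \Psi_{N,\theta,r,s}^2,$$

$$r = 0, 1, \ldots, p; \quad s = 0, 1, \ldots, q.$$

By deleting the subscript $N$ throughout, analogous definitions are made of
$u_{r,s}$, $M_\theta$, $\Psi_\theta$, $\Psi_{\theta,r,s}$, $\rho_\theta$, $\rho_{\theta,r,s}$.

Since $X_1, \ldots, X_m$ are identically distributed and $X_{m+1}, \ldots, X_N$ are identically
distributed, the values of $M_{N,\theta}$, $M_\theta$, $\rho_{N,\theta}$, $\rho_\theta$, $\rho_{N,\theta,r,s}$, $\rho_{\theta,r,s}$ do not depend
on the labels $\alpha_1, \ldots, \alpha_p$; $\beta_1, \ldots, \beta_q$. It is understood that $\Psi_{N,\theta,p,q} = \Psi_{N,\theta}$,
$\Psi_{N,\theta,0,0} = 0$. (A similar convention holds for $N$ removed.)

ASSUMPTION $\alpha$.

$$M_{N,\theta} \to M_\theta, \qquad \rho_{N,\theta,r,s} \to \rho_{\theta,r,s}$$

as $N \to \infty$, for $r = 0, \ldots, p$; $s = 0, \ldots, q$. (Recall that $\theta = \theta(N)$.)

Let

$$Y_N = p(Km)^{-1/2} \sum_{i=1}^{m} \Psi_{N,\theta,1,0}(X_i) + q((1 - K)n)^{-1/2} \sum_{i=m+1}^{N} \Psi_{N,\theta,0,1}(X_i).$$

Evidently $\mathrm{Var}\, Y_N \to (p^2/K)\rho_{\theta,1,0} + (q^2/(1 - K))\rho_{\theta,0,1} = L$ as $N \to \infty$, by
Assumption $\alpha$.

ASSUMPTION $\beta$.
$Y_N$ is asymptotically normal $(0, L^{1/2})$ as $N \to \infty$ and $m/N \to K$, $(0 < K < 1)$.

Since $Y_N$ is a sum of independent random variables, Assumption $\beta$ is not very
restrictive. It will be satisfied, for instance, if $u_N$ converges in probability to
$u$ as $N \to \infty$ and $\max(\rho_{\theta,0,1}, \rho_{\theta,1,0}) > 0$. (See [2], Theorem 3, p. 101.)

LEMMA 6.1. Assumption $\alpha$ implies that $N\,\mathrm{Var}_\theta U_N \to L$, $N\,\mathrm{Var}_\theta U \to L$, as
$N \to \infty$, $m/N \to K$.

PROOF.

$$N\,\mathrm{Var}_\theta U_N = N \binom{m}{p}^{-2} \binom{n}{q}^{-2} \sum_{s=0}^{q} \sum_{r=0}^{p} \sum{}^{(r,s)} E_\theta[\Psi_{N,\theta}(X_{\alpha_1}, \ldots, X_{\alpha_p}; X_{\beta_1}, \ldots, X_{\beta_q})\, \Psi_{N,\theta}(X_{\alpha_1'}, \ldots, X_{\alpha_p'}; X_{\beta_1'}, \ldots, X_{\beta_q'})],$$

where $\sum^{(r,s)}$ means summation over those subscripts where exactly $r$ equations
$\alpha_i = \alpha_j'$ are satisfied and exactly $s$ equations $\beta_i = \beta_j'$ are satisfied. Each term in
$\sum^{(r,s)}$ is equal to $\rho_{N,\theta,r,s}$ and the number of such terms is

$$\binom{p}{r}\binom{m-p}{p-r}\binom{m}{p}\binom{q}{s}\binom{n-q}{q-s}\binom{n}{q}.$$

A similar expression holds for $U$. The required result follows on
taking limits.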

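THEOREM 6.1. Assumptions $\alpha$ and $\beta$ imply that $N^{1/2} U_N$ is asymptotically normal,
$(0, L^{1/2})$, as $N \to \infty$.

PROOF. It is sufficient to show that $E_\theta(N^{1/2} U_N - Y_N)^2 \to 0$ as $N \to \infty$. Let
$D_N = N\,\mathrm{Var}_\theta U_N + \mathrm{Var}_\theta Y_N - 2N^{1/2} E_\theta U_N Y_N$. Consider the facts that

$$E_\theta \Psi_{N,\theta,1,0}(X_i)\, \Psi_{N,\theta}(X_{\alpha_1}, \ldots, X_{\alpha_p}; X_{\beta_1}, \ldots, X_{\beta_q}) = \rho_{N,\theta,1,0} \text{ if } i \text{ is one of } \alpha_1, \ldots, \alpha_p, \text{ and } 0 \text{ otherwise};$$

$$E_\theta \Psi_{N,\theta,0,1}(X_i)\, \Psi_{N,\theta}(X_{\alpha_1}, \ldots, X_{\alpha_p}; X_{\beta_1}, \ldots, X_{\beta_q}) = \rho_{N,\theta,0,1} \text{ if } i \text{ is one of } \beta_1, \ldots, \beta_q, \text{ and } 0 \text{ otherwise}.$$

Hence,

$$N^{1/2} E_\theta U_N Y_N = p^2 (N/mK)^{1/2} \rho_{N,\theta,1,0} + q^2 (N/n(1 - K))^{1/2} \rho_{N,\theta,0,1} \to (p^2/K)\rho_{\theta,1,0} + (q^2/(1 - K))\rho_{\theta,0,1},$$

and $D_N \to 0$ as $N \to \infty$.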
7. $L_h$ statistics.

DEFINITION 7.1. Let $p(t) = b_1 t + \cdots + b_h t^h$ $(0 \le t \le 1)$ be a polynomial
with real coefficients and let

(7.1) $t_N = \sum a_{Ni}\, p(R_i/N)$.

Such a $t_N$ is called an $L_h$ statistic. (Notice that including a constant term in $p(t)$
gives an equivalent $t_N$.) The test derived from $t_N$ is called an $L_h$ test.

The main purpose of this section is to show that an $L_h$ statistic is a U statistic
and is asymptotically normally distributed. Define

$$c(y) = 1 \text{ if } y > 0, \qquad c(y) = 0 \text{ if } y \le 0.$$

Then with probability 1, $R_i = 1 + \sum_{j=1}^{N} c(X_i - X_j)$ and

$$R_i^p = \sum{}^{(p)} B_{pp}\, c(X_i - X_{j_1}) \cdots c(X_i - X_{j_p}) + \cdots + \sum{}^{(t)} B_{pt}\, c(X_i - X_{j_1}) \cdots c(X_i - X_{j_t}) + \cdots + \sum{}^{(1)} B_{p1}\, c(X_i - X_{j_1}) + 1,$$

where $\sum^{(t)}$ is summation over all subscripts satisfying $1 \le j_1 < j_2 < \cdots < j_t \le N$
$(t = 1, \ldots, p)$. $B_{pt}$ depends on $p$ and on $t$ but not on $N$ and not on $i$. Since
$c(X_i - X_i) = 0$, it can be assumed that none of $j_1, \ldots, j_t$ is equal to $i$. The
expansion of $R_i^p$ makes use of the fact that $c^j(y) = c(y)$ $(j = 1, 2, \ldots)$.
Consider the $t + 1$ integers $i, j_1, \ldots, j_t$. Let $\alpha_1, \ldots, \alpha_s$ be those that are a
subset of $1, \ldots, m$ and let $\beta_1, \ldots, \beta_{t+1-s}$ be those that are a subset of
$m + 1, \ldots, N$.
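
The expansion of $R_i^p$ can be checked numerically. The sketch below is an illustration with hypothetical function names. It assumes one explicit choice of the coefficients, namely that $B_{pt}$ is the $t$-th forward difference of $k \mapsto (1 + k)^p$ at $k = 0$ (Newton's forward-difference formula); the paper only states that $B_{pt}$ depends on $p$ and $t$. It also uses the fact that, when exactly $k$ of the indicators $c(X_i - X_j)$ equal 1, the sum $\sum^{(t)}$ of their $t$-fold products equals the binomial coefficient $\binom{k}{t}$.

```python
from math import comb


def c(y):
    """Indicator used in Section 7: 1 if y > 0, else 0."""
    return 1 if y > 0 else 0


def rank(i, xs):
    """R_i = 1 + sum_j c(X_i - X_j), valid when there are no ties."""
    return 1 + sum(c(xs[i] - xj) for xj in xs)


def B(p, t):
    """Assumed coefficient: t-th forward difference of k -> (1 + k)**p at k = 0."""
    return sum((-1) ** (t - j) * comb(t, j) * (1 + j) ** p for j in range(t + 1))


def rank_power(i, xs, p):
    """R_i**p rebuilt as 1 + sum_t B(p, t) * (number of t-subsets with all c = 1)."""
    k = rank(i, xs) - 1                      # how many X_j lie strictly below X_i
    return 1 + sum(B(p, t) * comb(k, t) for t in range(1, p + 1))


if __name__ == "__main__":
    xs = [0.7, -1.2, 3.4, 0.1, 2.2]
    print([rank(i, xs) for i in range(len(xs))])
    print([rank_power(i, xs, 3) for i in range(len(xs))])
```

Because each product of indicators involves only finitely many observations, this expansion is what allows a polynomial in $R_i/N$ to be rewritten as an average of symmetric kernels.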
DEFINITION 7.2. Define us”... by 


NO Md weds ** 5 Mag 3 Mans **s Ted 
ayic(X, es X j,)e(X; = X j2) a e(X; i? X 5.) 
+ ay 5,C(X j, es Xi)e(X 5, roe X jz) . c(X;, = X j,) 


+ ay;,c(X;, — X;,)ce(X;, — Xj,) +--+ 


(p) (Pp) 
uot =Uoi = 0, 


ui?) = 1, Osssi+l, OsStsp. 


Notice that u?,.,:_, is an (s, + 1 — s) symmetric function (see Definition 6.1). 
Consider the ¢ + 1 integers a; , 8; satisfying 


(7.2) lsa<-++'-<aSm™; m+1sBi<-:-- < Bar, SN, 
and the 2(h + 1) integers a; , 8, satisfying 
(7.3) lsa,<---<aySm; m+1sSBi<-:- < Pu dN. 


rT . . . . . / 
The first set is called an associate of the second set if every a; is some a; and 


. , rin . . . ° . 
every §; is some 8; . Thus, any fixed set satisfying (7.2) is an associate of 


m— 8 a-t—1+e\.. are 
(, ee a Jl ot elas ) different sets satisfying (7.3). 







DEFINITION 7.3. Define uy by 


—1 \—1 
ri/2 / m n , , , , 
W'\at :) (, + .) UwlKay o> y Xangs i Xai» -** » Xbn4s) 


b p t+! ant —1 
a iz = 7 m-— 8s ) n—-t-—l1 + 8 Syl? 
= a ° 1— 
pal t=O sm8 h+1— 8, h-—t+s —" 
where >” is summation over all the u,”4;—, terms (for fixed s, t, p) whose indices 
are associates of a,,--- , @as13; 81,°°* , Baa. 
DEFINITION 7.4. 


aa, m ) n ) - 
r=(A 51 ‘aa; “™* 


where >’ is summation over all sets of the 2(h + 1) indices satisfying (7.3). 
The following theorem now follows directly from the constructions made
above.

THEOREM 7.1. $t_N = N^{1/2} U_N$, where $t_N$ is given by (7.1).


LEMMA 7.1. $u_N$ has the following properties:

(a) The $2(h + 1)$-dimensional space over which $u_N$ is defined is partitioned into a
finite number (which depends on $h$ but not on $N$) of disjoint sets on each of which
$u_N$ assumes a constant value.

(b) $\lim_{N\to\infty} u_N = u$ exists as $N \to \infty$, $m/N \to K$ $(0 < K < 1)$.

(c) $u_N$ is a $(h + 1, h + 1)$ symmetric function.

(d) $U_N$ is a U statistic.

Proor. (a), (b), (c), (d) all follow from the constructions made in definitions 
7.1 and 7.2. In particular, 


h p+i ‘ =~} 
» . m 7 m=+—- 8s 
w= limuy = tim (dy 4 Masta 


== () / 
\ \ 


n—-p—1+ - y-PHM2 a 9, 
h—pt+s ; - 


LEMMA 7.2. $u_N$, $u$ satisfy Assumption $\alpha$.

This follows from (a) and (b) of Lemma 7.1.

As in Section 6, it is still supposed that $\theta = \theta(N) \to \theta$ as $N \to \infty$.

LEMMA 7.3.

(a) $\mathrm{Var}_0 t_N \to \sum_{i,j=1}^{h} b_i b_j [(i + j + 1)^{-1} - (i + 1)^{-1}(j + 1)^{-1}] = \sigma^2$,

(b) $\mathrm{Var}_\theta t_N \to \sigma^2$ as $N \to \infty$.

PROOF OF (a). By Lemma 4.1,

$$\mathrm{Var}_0 t_N = \sum [p(i/N)]^2/(N - 1) - \left[\sum p(i/N)\right]^2 / N(N - 1).$$

The lemma follows from the fact that

$$\sum_{i=1}^{N} i^r/N^{r+1} \to 1/(r + 1) \text{ as } N \to \infty \quad (r = 1, 2, \ldots).$$

PROOF OF (b). Because Assumption $\alpha$ is satisfied, $\mathrm{Var}_\theta N^{1/2} U_N$ and $\mathrm{Var}_0 N^{1/2} U_N$
approach the same limit as $N \to \infty$. The proof follows from the fact that $t_N = N^{1/2} U_N$.

DEFINITION 7.5. Let $P_i(x) = p_{i0} + p_{i1}x + \cdots + p_{ii}x^i$ $(i = 0, 1, 2, \ldots)$
be the system of orthogonal polynomials associated with the weight function
$x^2$ over (0, 1); that is,

(7.4) $\int_0^1 P_i(x) P_j(x) x^2\, dx = 0$ if $i \ne j$, and $= 1$ if $i = j$.

In other words, $\{P_i(x)x\}$ is an orthonormal system. It is well known that this
orthonormal system is complete with respect to the Lebesgue square integrable
functions on (0, 1). For the very basic Hilbert space information needed in the
remainder of this paper, see [15].

DEFINITION 7.6. 


Ph—1,1 coe) Psi hi] 


ss A/(h + 2)\ 


bos Ih +8) 


i/(h+2) 1/(h+3) +++ 1/(Qh+)/ 


D 


h/(h + 1), 
’b, 
. ’ b 
\1/h + 1/ bn/ 
Any of the above matrix or vector symbols with a prime means a transpose. 
LEMMA 7.4. $A$ is a positive definite matrix.

PROOF. First notice that $1/(i + j + 1)$, the $(i, j)$th element of $A$, is equal to
$\int_0^1 x^{i-1} x^{j-1} x^2\, dx$. Hence, the orthogonality conditions (7.4) mean that

$$PAP' = I \text{ (the unit matrix)}.$$

Hence, $P$ is nonsingular and

$$A = P^{-1}(P')^{-1},$$

which proves the positive definiteness of $A$. (The idea of this lemma was
suggested by [1].)


THEOREM 7.2. Suppose $\max(|b_1|, \ldots, |b_h|) > 0$ and $m/N \to K$ $(0 < K < 1)$
as $N \to \infty$. Then

$$P_\theta\{(t_N - E_\theta t_N)(\mathrm{Var}_\theta t_N)^{-1/2} < s\} \to \Phi(s)$$

as $N \to \infty$. (Recall that $\theta = \theta(N)$.)

PROOF. By Lemma 7.3, $\mathrm{Var}_\theta t_N \to \sigma^2 = b'DADb$. Hence, $\sigma^2 > 0$, since $DAD$
is positive definite by Lemma 7.4. By (a) and (b) of Lemma 7.1, $u_N$ converges in
probability to $u$ as $N \to \infty$; hence Assumption $\beta$ is satisfied. (See the remark
following the statement of Assumption $\beta$.) The required result follows from
Theorem 6.1.
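
The identity $\sigma^2 = b'DADb$ used in this proof can be confirmed in exact arithmetic, together with the convergence of the finite-$N$ variance from the proof of Lemma 7.3(a). The following is a sketch only, with hypothetical function names. The entries $A_{ij} = 1/(i + j + 1)$ are taken from the proof of Lemma 7.4; the form $D = \mathrm{diag}(1/2, 2/3, \ldots, h/(h + 1))$ is an assumption inferred from the requirement that $b'DADb$ reproduce the sum in Lemma 7.3(a).

```python
from fractions import Fraction as Fr


def sigma2_sum(b):
    """Limit in Lemma 7.3(a): sum b_i b_j [1/(i+j+1) - 1/((i+1)(j+1))]."""
    h = len(b)
    return sum(b[i - 1] * b[j - 1] * (Fr(1, i + j + 1) - Fr(1, (i + 1) * (j + 1)))
               for i in range(1, h + 1) for j in range(1, h + 1))


def sigma2_quadratic_form(b):
    """b' D A D b with A_ij = 1/(i+j+1) and D = diag(i/(i+1)) (D is assumed)."""
    h = len(b)
    d = [Fr(i, i + 1) for i in range(1, h + 1)]
    # With 0-based i, j the matrix entry A_{i+1, j+1} is 1/(i + j + 3).
    return sum(b[i] * d[i] * Fr(1, i + j + 3) * d[j] * b[j]
               for i in range(h) for j in range(h))


def var0(b, N):
    """Finite-N null variance from the proof of Lemma 7.3(a)."""
    p = [sum(bk * (i / N) ** (k + 1) for k, bk in enumerate(b)) for i in range(1, N + 1)]
    return sum(v * v for v in p) / (N - 1) - sum(p) ** 2 / (N * (N - 1))


if __name__ == "__main__":
    b = [Fr(1), Fr(-1, 2)]                   # p(t) = t - t^2/2, an arbitrary example
    print(sigma2_sum(b), sigma2_quadratic_form(b), var0([1.0, -0.5], 4000))
```

The limit is simply $\int_0^1 p^2(t)\, dt - (\int_0^1 p(t)\, dt)^2$, which is positive unless $p$ is constant on (0, 1).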


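8. The large-sample power of $L_h$ tests.

ASSUMPTION C.

$$0 < \int_{-\infty}^{\infty} H^2(x) f(x)\, dx < \infty \quad \text{(Lebesgue integral)}.$$

LEMMA 8.1. Assumptions A, B, C imply that

(8.1) $N^{-1/2} \sum a_{Ni} \left.\frac{\partial E_\theta (R_i/N)^j}{\partial\theta}\right|_{\theta=0} \to l_j$, as $N \to \infty$,

where

$$l_j = K^{1/2}(1 - K)^{1/2} \int_{-\infty}^{\infty} H(x) F^j(x) f(x)\, dx, \quad \text{and} \quad F(x) = \int_{-\infty}^{x} f(t)\, dt.$$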


Proor. Let P;, be the probability that the random variable X; has rank t, 
(i,t = 1,---, N). Then 


m N N 
Py = 2’ {0 filz:,0) I] falxzi, 0) [] dz;, 
tonl t=m+1 i=l 
where f is over the set where z;, < +--+ < 2j,_, < 2% < 24, < +++ < Xjy and 
>’ is over all permutations j;, +++ , jer, Jest, °** » Jw Of the N — 1 numbers 
bO ise ge £6 +4, +° 
An elementary but tedious computation shows that 


OP x - Sit ") r t—1 _ Blx)\*—* 
3 |ea ~ MN —1) . [ H(x)F(x)" (1 — F(x))" f(x) dz, 
Let $y^{(j)}$ denote the factorial $y(y - 1) \cdots (y - j + 1)$. Then


( — 1)\f) | ‘ 
aby EE] _ b= 0 aPa| 
al ra Ni 00 | om0 
06 6—0 


3(N — 2) N74 r H (x) F(x) f(x) dx 


by another routine computation. The result of the theorem follows from the 
evaluation of the limit of 

N 7 \@ 7? | 

- dE(R; — 1)" /N’ | 

n+ on, dE(R; — 1)*/N’ | 


—— as NN > a, 
tol 06 | Oma) 5 





and from the fact that this limit must be the same as the limit of the left-hand 
side of (8.1). (Notice that by Assumption C and by the Schwartz inequality the 
integrals displayed above all exist.) 
DEFINITION 8.1. Let

$$l = (l_1, \ldots, l_h)'$$

and define $B$ by $B'B = DAD$. (This factorization makes sense by Lemma 7.4.
Also, by the same lemma, $B$ is nonsingular.)

The next theorem shows that the theorems on large-sample power can be
applied to $L_h$ tests.

THEOREM 8.1.

ASSUMPTIONS:

(a) $\max\{|b_1|, \ldots, |b_h|\} > 0$,

(b) $\lambda_N \to \lambda$, as $N \to \infty$, where $\Phi(\lambda) = 1 - \alpha$,

(c) Assumptions A, B, and C hold.

CONCLUSION. The large-sample power of the test derived from (7.1) is
$1 - \Phi(\lambda - \delta c)$, where

(8.2) $c = (b'l)(b'B'Bb)^{-1/2} = (b'l)(b'DADb)^{-1/2}$.

PROOF. The proof will follow by verifying the assumptions of Theorem 2.1:

Condition (a) of Theorem 2.1 holds by Theorem 7.2.

Conditions (b) and (c) of Theorem 2.1 hold by Theorem 7.1 and Lemma 7.1.

Condition (d) of Theorem 2.1 together with the explicit value of $c$ follow from
Lemmas 7.3 and 8.1.

(Notice that it is no loss to suppose $c \ge 0$, since otherwise the statistic $-t_N$
could be used.)

COROLLARY. The maximum large-sample power obtainable with a $p_h$-rank order
statistic is $1 - \Phi(\lambda - \delta\bar c)$, where

$$\bar c = [l'(DAD)^{-1} l]^{1/2}.$$

PROOF. $c = (b'B'B'^{-1}l)(b'B'Bb)^{-1/2}$. By Schwarz's inequality,

$$c \le (b'B'Bb)^{1/2}(l'B^{-1}B'^{-1}l)^{1/2}(b'B'Bb)^{-1/2} = (l'B^{-1}B'^{-1}l)^{1/2}.$$

This maximum is achieved by choosing the coefficients of the polynomial $p(t)$ as
$b = (DAD)^{-1}l$.

DEFINITION 8.2. The $L_h$ statistic which maximizes the large-sample power
(for fixed $h$) is called the locally best $L_h$ statistic. The test derived from it is
called the locally best $L_h$ test.
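
The optimal coefficients $b = (DAD)^{-1}l$ of the Corollary can be computed exactly for a small example. The sketch below is an illustration with hypothetical function names, under several labeled assumptions: $D = \mathrm{diag}(j/(j + 1))$ and $A_{ij} = 1/(i + j + 1)$; the vector $l$ is written, after the substitution $t = F(x)$, as $l_j = \int_0^1 (1 - 2t) t^j\, dt$, which corresponds to $H(x) = 1 - 2F(x)$ with the factor $K^{1/2}(1 - K)^{1/2}$ dropped.

```python
from fractions import Fraction as Fr


def dad_matrix(h):
    """(DAD)_ij = (i/(i+1)) (j/(j+1)) / (i+j+1), i, j = 1..h (D is assumed diagonal)."""
    return [[Fr(i, i + 1) * Fr(j, j + 1) / (i + j + 1) for j in range(1, h + 1)]
            for i in range(1, h + 1)]


def solve(M, y):
    """Solve M x = y by Gauss-Jordan elimination in exact rational arithmetic."""
    n = len(y)
    aug = [row[:] + [y[r]] for r, row in enumerate(M)]
    for col in range(n):
        piv = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[piv] = aug[piv], aug[col]
        pivot = aug[col][col]
        aug[col] = [v / pivot for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[col])]
    return [aug[r][n] for r in range(n)]


def best_coefficients(l):
    """Return b = (DAD)^{-1} l and the squared maximal constant l'(DAD)^{-1} l."""
    b = solve(dad_matrix(len(l)), l)
    return b, sum(li * bi for li, bi in zip(l, b))


# Illustration: l_j = integral over (0,1) of (1 - 2t) t^j = -j / ((j+1)(j+2)).
l = [Fr(-j, (j + 1) * (j + 2)) for j in range(1, 4)]
b, c2 = best_coefficients(l)

if __name__ == "__main__":
    print(b, c2)
```

In this example only the linear coefficient is nonzero and the squared constant equals $\int_0^1 (1 - 2t)^2\, dt = 1/3$ already at $h = 1$, which agrees with the later statement that a linear score function gives the best $L_h$ statistic for every $h \ge 1$ against this kind of alternative.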

ASSUMPTION D. $F(x) = \int_{-\infty}^{x} f(t)\, dt$ is an increasing function of $x$ whenever
$F(x)$ is not zero or one.

Let $F(x) = t$. Assumption D implies that this function has a continuous
inverse, $x = \rho(t)$ $(0 < t < 1)$.

DEFINITION 8.3.

$$g(t) = K^{1/2}(1 - K)^{1/2} H(\rho(t)),$$

$$G(t) = \int_0^t g(x)\, dx.$$

Since

$$\int_0^1 H^2(\rho(t))\, dt = \int_{-\infty}^{\infty} H^2(x) f(x)\, dx,$$

Assumption C implies that $g(t)$ is Lebesgue square integrable on (0, 1). What
follows also requires this to be true for $G(t)/t$ and Assumption E, below, is a
convenient way of insuring this.

ASSUMPTION E. $G^2(t)/t \to 0$ as $t \to 0$.

LEMMA 8.2. Assumptions C and E imply that $G(t)/t$ is Lebesgue square
integrable on (0, 1).

PROOF. $G(t)$ is continuous and differentiable at any point of $(\epsilon, 1)$
$(0 < \epsilon < 1)$, hence $\int_\epsilon^1 G^2(t)/t^2\, dt$ exists. Integrating by parts and using the
Schwarz inequality implies that

$$\int_\epsilon^1 G^2(t)/t^2\, dt = G^2(\epsilon)/\epsilon - G^2(1) + 2\int_\epsilon^1 G(t) t^{-1} g(t)\, dt$$

$$\le G^2(\epsilon)/\epsilon + 2\left[\int_\epsilon^1 G^2(t) t^{-2}\, dt\right]^{1/2} \left[\int_\epsilon^1 g^2(t)\, dt\right]^{1/2}.$$

The left-hand side of Assumption C implies that $[\int_\epsilon^1 G^2(t) t^{-2}\, dt]^{1/2} > 0$; hence
dividing by this quantity gives the desired result.
LEMMA 8.3. Let

$$v = (v_1, \ldots, v_h)'.$$

Let $s(t)$ be a function on (0, 1) such that

$$v_j = \int_0^1 t^j s(t)\, dt \qquad (j = 1, \ldots, h).$$

Then the vector

$$Pv = (v_1 p_{00},\; v_1 p_{10} + v_2 p_{11},\; \ldots,\; v_1 p_{h-1,0} + v_2 p_{h-1,1} + \cdots + v_h p_{h-1,h-1})'$$

is equal to

$$\left(\int_0^1 P_0(t)\, t\, s(t)\, dt,\; \ldots,\; \int_0^1 P_{h-1}(t)\, t\, s(t)\, dt\right)'.$$

In other words, the $j$th element of $Pv$ is the $j$th Fourier coefficient of $s(t)$ with
respect to the orthonormal system $\{P_j(t)t\}$. The proof follows simply from the
fact that

$$\int_0^1 P_j(t)\, t\, s(t)\, dt = \sum_{i=0}^{j} p_{ji} \int_0^1 t^{i+1} s(t)\, dt = \sum_{i=0}^{j} p_{ji} v_{i+1}.$$

Recall that the corollary to Theorem 8.1 says that the maximum large-sample
power obtainable with a $p_h$-rank order statistic is

$$1 - \Phi(\lambda - \delta\bar c), \qquad \bar c = [l'(DAD)^{-1} l]^{1/2}.$$

The fact that $\bar c$ depends on $h$ is now stressed by writing $\bar c_h$.

LEMMA 8.4. Assumptions C, D, and E imply that

$$\bar c_h^2 \to \int_0^1 g^2(t)\, dt = K(1 - K) \int_{-\infty}^{\infty} H^2(x) f(x)\, dx, \quad \text{as } h \to \infty.$$

PROOF. Using integration by parts,

$$\int_0^1 [g(t) - G(t)/t]\, t^j\, dt = \int_0^1 g(t) t^j\, dt - \int_0^1 G(t) t^{j-1}\, dt = j^{-1}(j + 1) \int_0^1 g(t) t^j\, dt = j^{-1}(j + 1)\, l_j.$$

By Lemma 8.3, the vector $PD^{-1}l$ is the vector of the first $h$ Fourier coefficients of
$g(t) - G(t)/t$. Hence, by Parseval's Theorem (see [15])

$$l'D^{-1}P'PD^{-1}l \to \int_0^1 [g(t) - G(t)/t]^2\, dt, \quad \text{as } h \to \infty.$$

Integration by parts gives $\int_0^1 g^2(t)\, dt$ for this last integral, which proves the
lemma.

It should be noticed that Lemma 8.4 implies that Assumption (a) of Theorem
8.1 is satisfied when $b = (DAD)^{-1}l$.

The results of Theorem 8.1 and Lemma 8.4 can be summarized in the following
Theorem 8.2, which together with Theorem 8.3, might be considered the main
results of Part II of this paper.

THEOREM 8.2. Suppose Assumptions A through E hold and $\lambda_N \to \lambda$ as $N \to \infty$.

Let $\varphi_{N,h}(\theta)$ be the power function of the best $L_h$ test. Choose any positive
$\epsilon$ and a positive number $\delta$. Then there is an $N' = N'(\epsilon, \delta)$ and an $h' = h'(\epsilon, \delta)$
such that

$$|\varphi_{N,h'}(\theta) - (1 - \Phi(\lambda - \theta N^{1/2} c))| < \epsilon$$

for $\theta N^{1/2} \le \delta$, and all $N \ge N'$, where

$$c = \left(\int_0^1 g^2(t)\, dt\right)^{1/2}.$$

REMARKS ON THEOREM 8.2.

(a) Roughly, this theorem says that the large-sample power of the best $L_h$ test
approaches $1 - \Phi(\lambda - \delta c)$ as $h$, the order of $p(t)$, is made large.

(b) This theorem lends credence to the conjecture made in Remark (a) of
Theorem 4.1. It is reasonable that the "polynomial approximation" to the best
rank order test should behave almost like the best test itself, but this does not
constitute a proof.

Let $(f_1(x, \theta), f_2(x, \theta))$, $(\bar f_1(x, \theta), \bar f_2(x, \theta))$ be two (possibly different) sets of
alternatives. Let $\bar g(t)$, $\bar G(t)$, $\bar H(x)$, $\bar l$, $\bar\rho(t)$ be defined analogously to $g(t)$, $G(t)$,
$H(x)$, $l$, $\rho(t)$.

THEOREM 8.3. Suppose both sets of alternatives $(f_1, f_2)$, $(\bar f_1, \bar f_2)$ satisfy
Assumptions A through E.

Let $\bar t_N$ be the locally best $L_h$ statistic against the alternative $(\bar f_1, \bar f_2)$. Then, if the
true alternative is $(f_1, f_2)$,

(a) the large-sample power of the test derived from $\bar t_N$ is $1 - \Phi(\lambda - \delta c_h)$, where

$$c_h = (l'(DAD)^{-1}\bar l)(\bar l'(DAD)^{-1}\bar l)^{-1/2}.$$

(b)

$$c_h \to \frac{\int_0^1 \bar g(t) g(t)\, dt}{\left(\int_0^1 \bar g^2(t)\, dt\right)^{1/2}} = c.$$

Hence (in the sense described in Theorem 8.2), the large-sample power of the test
derived from $\bar t_N$ approaches $1 - \Phi(\lambda - \delta c)$ as the order of $p(t)$ is made large.

PROOF. Part (a) follows from Theorem 8.1 by setting the $b$ vector in (8.2)
equal to

$$b = (DAD)^{-1}\bar l.$$

Part (b) follows from considerations exactly analogous to those used in proving
Lemma 8.4. The vectors $PD^{-1}l$ and $PD^{-1}\bar l$ are the vectors of the first $h$ Fourier
coefficients of $g(t) - G(t)/t$ and $\bar g(t) - \bar G(t)/t$, respectively. Hence, by
Parseval's theorem,

$$l'D^{-1}P'PD^{-1}\bar l \to \int_0^1 (g(t) - G(t)/t)(\bar g(t) - \bar G(t)/t)\, dt = \int_0^1 g(t)\bar g(t)\, dt, \quad \text{as } h \to \infty.$$

COROLLARY. If the locally best L statistic against $(\bar f_1, \bar f_2)$ is an $L_{h'}$ statistic for
some $h = h'$, then

$$c_h = c \text{ for all } h \ge h'.$$

PROOF. This is because the locally best $L_h$ statistic for all $h \ge h'$ must be
the locally best $L_{h'}$ statistic.
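
The limiting constant of Theorem 8.3(b) lends itself to direct numerical evaluation. The sketch below is an illustration with hypothetical function names. It takes the true alternative to be a normal shift, for which $g(t)$ is proportional to the standard normal quantile function, and the test to be built on a linear score function $\bar g(t) \propto 2t - 1$. The constant factors and the sign are assumptions fixed for convenience (they cancel in the ratio), and the squared ratio of the two constants is read as an asymptotic efficiency under the usual Pitman convention.

```python
from math import pi, sqrt
from statistics import NormalDist


def integrate(fn, n=20000):
    """Midpoint rule on (0, 1); avoids the endpoints where inv_cdf is undefined."""
    return sum(fn((k + 0.5) / n) for k in range(n)) / n


def limiting_constant(g_bar, g):
    """c = integral(g_bar * g) / sqrt(integral(g_bar**2)), as in Theorem 8.3(b)."""
    num = integrate(lambda t: g_bar(t) * g(t))
    return num / sqrt(integrate(lambda t: g_bar(t) ** 2))


inv = NormalDist().inv_cdf


def g_normal(t):
    """Score function for a normal shift alternative (assumed scaling and sign)."""
    return inv(t)


def g_linear(t):
    """Linear score function, as used by a Wilcoxon-type statistic."""
    return 2.0 * t - 1.0


c_linear = limiting_constant(g_linear, g_normal)
c_best = limiting_constant(g_normal, g_normal)
efficiency = (c_linear / c_best) ** 2

if __name__ == "__main__":
    print(c_linear, c_best, efficiency, 3 / pi)
```

The computed ratio is close to $3/\pi$, the familiar efficiency of the Wilcoxon-Mann-Whitney test relative to the best test under normal shift alternatives.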


9. Applications and examples. 

I. Location-parameter alternatives. Let $f_i(x, \theta) = f(x + m_i(\theta))$ $(i = 1, 2)$,
where $f(x)$ is a density function not depending on $\theta$; $m_1(0) = m_2(0) = 0$,
$m_1'(0) - m_2'(0) \ne 0$. Let $p(t)$ be the inverse of $F(x) = t$, where $F(x) = \int_{-\infty}^{x} f(t)\,dt$.

Evidently, the carrier of $f(x)$ must be $(-\infty, \infty)$ if Assumption B is to be
satisfied. (This, of course, is not sufficient for Assumption B.) It is easy to verify
that

$$g(t) = D f'(p(t)) / f(p(t)), \qquad G(t) = D f(p(t)), \qquad \text{where } D = (m_1'(0) - m_2'(0)) K^{1/2} (1 - K)^{1/2},$$

and that Conditions C through E hold if

$$(9.1)\qquad \int_{-\infty}^{\infty} (f'(x))^2 / f(x)\,dx < \infty,$$

and $f'(x) \to 0$ as $x \to \pm\infty$. (L'Hospital's rule is used, together with the fact
that $f(-\infty)$, $f(\infty)$ must equal zero. It is easy to verify (9.1) for the normal
density function, for instance. For the usual pathological example, the Cauchy
distribution, these conditions can be shown to hold also.)

One can make quite similar remarks about scale-parameter alternatives,
where

$$(9.2)\qquad f_i(x, \theta) = \frac{1}{\sigma_i(\theta)}\, f\!\left(\frac{x - m}{\sigma_i(\theta)}\right) \quad \text{and} \quad \sigma_i(0) = 1, \qquad i = 1, 2.$$

II. Normal alternatives. Suppose $f_1(x, \theta)$, $f_2(x, \theta)$ are normal density functions,
$(m_1(\theta), 1)$, $(m_2(\theta), 1)$ respectively, where $m_1 - m_2 = \theta$. Then an easy calculation
shows that $\int_0^1 g^2(t)\,dt = K(1 - K)$. For $\theta > 0$, there is a uniformly most powerful
similar test based on > ent X;/m— > in maac/n= Ty. As 


Ey o(N Varo Ty)? > K7(1 — 


as N — « and 6 = 6N™”, it follows that the large-sample power of the locally 
best L,, test is, for large h, arbitrarily close to the large-sample power of the test 
based on Ty. 

In an exactly analogous way one can treat the scale-parameter alternative 
with normal density functions. In that case, also, the locally best rank order 
test has the same large-sample power as the test based on the F statistic. 

III. The asymptotic efficiency of the Wilcoxon-Mann-Whitney statistic.*

Let $f_1(x, \theta) = f(x)$, the uniform density on (0, 1), and let $f_2(x, \theta) =
2\theta F(x) f(x) + (1 - \theta) f(x)$. By Theorem 3.2, the best rank order statistic against
this alternative is equivalent to $\sum a_{Ni} R_i$, which is equivalent to the Wilcoxon-
Mann-Whitney statistic [11]. This must then be the best $L_h$ statistic for all
h = 1. Let fi , fe, f be as described in Example I on location-parameter alterna- 
tives. We consider the alternative given by density functions f; , fe . Let ty be the 


locally best rank order statistic against that alternative. By Theorems 8.2 and
8.3 (Corollary) and an elementary calculation, the asymptotic efficiency of
$\sum a_{Ni} R_i$ relative to $t_N$ when the true alternative is $(f_1, f_2)$ is the square of

$$(9.3)\qquad \frac{\int_{-\infty}^{\infty} (1 - 2F(x))\, f'(x)\,dx}{\left(\int_0^1 (2t - 1)^2\,dt\right)^{1/2} \left(\int_{-\infty}^{\infty} \left(\frac{f'(x)}{f(x)}\right)^2 f(x)\,dx\right)^{1/2}} = \frac{2\sqrt{3} \int_{-\infty}^{\infty} f^2(x)\,dx}{\left(\int_{-\infty}^{\infty} \left(\frac{f'(x)}{f(x)}\right)^2 f(x)\,dx\right)^{1/2}}.$$

The role of $(f_1, f_2)$ can be dropped. The number (9.3) is the asymptotic efficiency
of $\sum a_{Ni} R_i$ relative to $t_N$ where the true alternative is $(f_1, f_2)$.

Letting $(f_1, f_2)$ be normal densities $(m_1(\theta), 1)$, $(m_2(\theta), 1)$ where $m_1(\theta) -
m_2(\theta) = \theta$, (9.3) turns out to be $\sqrt{3/\pi}$, hence the asymptotic efficiency is $3/\pi$.
By (II), above, this is also the asymptotic efficiency of $\sum a_{Ni} R_i$ relative to the $t$
statistic. This result was apparently first given by Pitman in [14]. (See also
[12].)
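The value $3/\pi$ can be checked numerically. The sketch below assumes that (9.3) reduces, after integration by parts, to $2\sqrt{3}\int f^2\,dx \,/\, (\int (f'/f)^2 f\,dx)^{1/2}$; the function names and the truncation of the integrals to $[-10, 10]$ are illustrative choices, not part of the paper.

```python
import math

def normal_pdf(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def simpson(func, lo, hi, steps=20000):
    """Composite Simpson rule on [lo, hi]; steps must be even."""
    h = (hi - lo) / steps
    total = func(lo) + func(hi)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * func(lo + k * h)
    return total * h / 3.0

def wilcoxon_ratio(pdf, score, lo=-10.0, hi=10.0):
    """2*sqrt(3) * int f^2 dx / sqrt(int (f'/f)^2 f dx), the assumed form of (9.3)."""
    int_f2 = simpson(lambda x: pdf(x) ** 2, lo, hi)
    information = simpson(lambda x: score(x) ** 2 * pdf(x), lo, hi)
    return 2.0 * math.sqrt(3.0) * int_f2 / math.sqrt(information)

# For the standard normal density, f'(x)/f(x) = -x.
ratio = wilcoxon_ratio(normal_pdf, lambda x: -x)
efficiency = ratio ** 2
print(ratio, math.sqrt(3.0 / math.pi))
print(efficiency, 3.0 / math.pi)
```

Passing a different density and score function to `wilcoxon_ratio` gives the corresponding efficiency for other location families, provided the integrals converge on the chosen range.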

IV. A rank test for dispersion. In [12] Mood suggested a rank order statistic
(against a dispersion alternative) equivalent to $\sum a_{Ni}[R_i^2 - (N + 1)R_i]$. This
statistic is asymptotically equivalent to

$$(9.4)\qquad \sum a_{Ni}\left[(R_i/N)^2 - R_i/N\right].$$

(See Remark (b), Theorem 2.1.)


* See Remark (c) to Theorem 2.1 for definition of efficiency 
Let $f_1(x, \theta) = (1 - \theta)\,6[F(x) - F^2(x)]\, f(x) + \theta f(x)$,
$f_2(x, \theta) = f(x)$, the uniform density on (0, 1).


By Theorem 3.2 the best rank order statistic against $(f_1, f_2)$ is asymptotically
equivalent to (9.4). Let $t_N$ be the best rank order statistic against the alternative
$(f_1, f_2)$ described by (9.2). Theorems 8.2, 8.3 (Corollary), and an elementary
calculation show that the asymptotic efficiency of (9.4) relative to $t_N$ when
$(f_1, f_2)$ is the true alternative is the square of

$$(9.5)\qquad \frac{\int_{-\infty}^{\infty} \left[6(F^2(x) - F(x)) + 1\right]\left(x f'(x) + f(x)\right) dx}{\left(\int_0^1 \left[6(t^2 - t) + 1\right]^2 dt\right)^{1/2} \left(\int_{-\infty}^{\infty} \left[x f'(x) + f(x)\right]^2 / f(x)\,dx\right)^{1/2}}.$$

As in (III), above, the role of $f_1, f_2$ can be dropped.


If for the alternative $(f_1, f_2)$ as given by (9.2) $f$ is the normal (0, 1) density
function and

$$\frac{\sigma_1(\theta)}{\sigma_2(\theta)} = 1 - \theta$$

(hence $H_0$ implies the variances are the same), then (9.5) becomes

$$\frac{\int_{-\infty}^{\infty} \left(6F(x) - 6F^2(x) - 1\right)(1 - x^2)\, f(x)\,dx}{\left(\int_0^1 (6t^2 - 6t + 1)^2\,dt\right)^{1/2} \left(\int_{-\infty}^{\infty} (x^2 - 1)^2 f(x)\,dx\right)^{1/2}}.$$

The integrals in the numerator are evaluated in [8]. By the remark at the end
of II, this is also the efficiency of (9.4) relative to the $F$ statistic. This result was
given by Mood in [12].
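As a numerical check, the sketch below evaluates the normal-case ratio by quadrature, assuming it has numerator $\int (6F - 6F^2 - 1)(1 - x^2) f\,dx$ and denominator $(\int_0^1 (6t^2 - 6t + 1)^2 dt)^{1/2} (\int (x^2 - 1)^2 f\,dx)^{1/2}$ with $F$, $f$ the standard normal distribution and density. Under that assumption the square of the ratio should agree with the familiar value $15/(2\pi^2)$ for Mood's test; the helper names are illustrative.

```python
import math

def phi(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def Phi(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def simpson(func, lo, hi, steps=20000):
    """Composite Simpson rule on [lo, hi]; steps must be even."""
    h = (hi - lo) / steps
    total = func(lo) + func(hi)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * func(lo + k * h)
    return total * h / 3.0

numerator = simpson(
    lambda x: (6.0 * Phi(x) - 6.0 * Phi(x) ** 2 - 1.0) * (1.0 - x * x) * phi(x),
    -10.0, 10.0)
denom_t = simpson(lambda t: (6.0 * t * t - 6.0 * t + 1.0) ** 2, 0.0, 1.0)
denom_x = simpson(lambda x: (x * x - 1.0) ** 2 * phi(x), -10.0, 10.0)

mood_ratio = numerator / math.sqrt(denom_t * denom_x)
mood_efficiency = mood_ratio ** 2
print(mood_efficiency, 15.0 / (2.0 * math.pi ** 2))
```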
THE ESTIMATION OF THE SIZE OF A STRATIFIED
ANIMAL POPULATION


By D. G. CHAPMAN¹ AND C. O. JUNGE, JR.


University of Washington and Washington State Department of Fisheries


0. Summary. The estimation of the size of an animal population is considered 
for a situation where the population is stratified and only partial mixing takes 
place between strata. A consistent estimate is found and its variance deter- 
mined. It is shown that estimates previously given or frequently used in this 
situation are not necessarily consistent and, in fact, may be meaningless. Condi- 
tions for their consistency are determined. Some further statistical problems in 
estimating the interstrata migration are discussed. 


1. Introduction. The use of tagging procedures in estimating the size of animal
populations is now well known; the problems of sampling such populations
so that the procedure conforms to the mathematical models used in the analysis
of the resulting data have also been stressed, particularly by DeLury [4].

We recall that the simplest procedure involves marking or tagging some mem- 
bers of the population and subsequently taking a sample of the population, the 
sampling being random with respect to the marked animals. This procedure 
and various extensions of it have been carried out on many populations, par- 
ticularly on a small scale, and it has been customary to assume that where the 
sampling occurred without replacement, the random number of tag recoveries 
would follow the hypergeometric distribution or its various approximations 
(binomial, Poisson, normal). This will certainly be true if each member of the 
population is equally ‘‘catchable” and the capture of one member does not affect 
the chances of capture of others. 

If the experimenter has a large and widespread population under study, it 
is no longer safe to make these assumptions. Thus, populations of fish in the 
ocean are subject to widely different fishing intensities in different areas. Conse- 
quently, if the sample is obtained from such a fishery it is hardly to be expected 
that it will be random, unless there is a complete mixing of the population 
throughout the differentially fished areas in the time that elapses between tagging 
and sampling. 

However, it is known that some animal and many fish populations are made 
up of several groups [“‘tribes” is a word suggested by European fisheries biolo- 
gists; see [1] for example] differentiated by their location. Some degree of mixing 
occurs between these adjacent groups or tribes, either continuously or at in- 
tervals. Hence, during the time interval between tagging and sampling, while 
the tagged animals are mixing within their groups or tribes, there may also
occur between-groups mixing. The nature and amount of this is almost certainly 
unknown, and, in fact, the experimenter will often wish to analyze the tag re- 
turns to cast some light on just this aspect of the population dynamics. Moreover, 
in general, it is not possible to state that an animal captured in stratum $j$ had
previously belonged to stratum $i$, though it is usually possible to identify the tags so
that this information is available for recaptured tagged animals.

In fact, it is to be expected that in almost all extensive populations of marine 
animals this is the situation the biologist will face, i.e., a population with segre- 
gation by area but some mixing between areas and a population where the 
different subgroups are usually indistinguishable, at least on superficial examina- 
tion. 


Estimation of the population size in a situation of partial mixing of this type
seems to have been first considered by Schaefer [8]; he dealt with a migrating
salmon population. An example involving stratification by area in connection 
with the Pribilof fur seals has been noted in [6]. Of the many other possible 
examples we note only two: the important halibut population (see [9]) and a 
smaller flatfish population which has been the subject of a recent intensive 
study [7]. 

In this paper formulae are given for estimating the size of such a mixing popu- 
lation in those cases where tags are put out in all strata. Estimates can be given 
of the population migration between strata as well as of the total population 
size. Asymptotic variances of these estimates are obtained. We also determine 
under what assumptions the Petersen estimate made disregarding stratification 
and the estimate proposed by Schaefer in [8] are valid in the sense of being con- 
sistent. Examples are given to show further that these estimates may give 
meaningless results. 


2. Notation and assumptions.

$N_{ij}$ = number of individuals that are in stratum $i$ at the time of tagging and
in stratum $j$ at time of sampling,
$t_{ij}$ = number of tagged individuals in stratum $i$ at tagging time and in stratum
$j$ at sampling time,
$n_{ij}$ = number of sampled individuals in stratum $i$ at tagging time and in
stratum $j$ at sampling time,
$s_{ij}$ = number of tagged individuals, tagged in stratum $i$ and subsequently
recovered in a sample from stratum $j$ $(i, j = 1, 2, \ldots, r)$.

Sums over any subscript will be denoted by replacing the subscript by a dot.
Thus,

$$t_{i.} = \sum_{j=1}^{r} t_{ij} = \text{number of tags put out in stratum } i,$$

$$n_{.j} = \sum_{i=1}^{r} n_{ij} = \text{number sampled in stratum } j.$$
The $N_{ij}$, $N_{.j}$, and $N_{..}$ (the total population size) are regarded as unknown
parameters, the $t_{i.}$, $n_{.j}$ as known parameters; the $t_{ij}$ and $n_{ij}$ are unobservable
random variables, while the $s_{ij}$ are the observed random variables. It is assumed
that all the parameters are positive.

Let $S$ be the $r \times r$ matrix having $s_{ij}$ in the $i$th row and $j$th column,
$i = 1, \ldots, r$, $j = 1, \ldots, r$; let $|S|$ be the determinant of $S$, and let

$$n' = (n_{.1}, n_{.2}, \ldots, n_{.r}), \qquad t' = (t_{1.}, t_{2.}, \ldots, t_{r.}), \qquad \left(\frac{N}{n}\right)' = \left(\frac{N_{.1}}{n_{.1}}, \ldots, \frac{N_{.r}}{n_{.r}}\right).$$

Our model may be thought of as consisting of $r$ urns, where urn $i$ contains
$N_{i.}$ marbles of which $t_{i.}$ have been marked, a different mark being used in each
urn. After stirring the marked marbles in each urn, an unknown number of
marbles (possibly zero) are placed in each of the other $(r - 1)$ urns. After this
process, $N_{.j}$ is the number of marbles in the $j$th urn and $N_{ij}$ is the number of
these which were originally in the $i$th urn. After again stirring the marbles in
each urn, a sample of $n_{.j}$ marbles is taken from the $j$th urn; of these, $s_{ij}$ are
observed to have been marked originally in the $i$th urn. From the known values
$t_{i.}$, $n_{.j}$, and $s_{ij}$, it is desired to estimate the total number of marbles in the $r$
urns, i.e.,

$$N_{..} = \sum_{i=1}^{r} N_{i.} = \sum_{j=1}^{r} N_{.j}.$$


It appears to be useful to set out in detail the assumptions that need to be 
made for the determination of any estimate and, in particular, to distinguish 
those which relate to the experimenter’s actions and those which relate to nature. 
The minimum possible assumption that could be made seems to be 


$$\text{I.}\qquad E(s_{ij} \mid n_{ij}, t_{ij}) = \frac{n_{ij}\, t_{ij}}{N_{ij}} \quad \text{for all } i, j.$$


This expected value would occur if a random sample is taken within the ijth 
substratum. 

However, a model constructed on assumption I appears to be inadequate to
yield an estimate of $N_{..}$, for it involves $3r^2$ unknowns ($n_{ij}$, $t_{ij}$, $N_{ij}$'s) and
there are only $r^2$ observable random variables ($s_{ij}$) plus $2r$ side conditions
($\sum_i n_{ij} = n_{.j}$, $\sum_j t_{ij} = t_{i.}$) to determine these. The information is inadequate,
except in case $r = 1$, so that it is necessary to make further assumptions to set
up some structure relating the various substrata. In this respect it is sufficient
that

$$\text{II.}\qquad E(n_{ij}) = n_{.j}\,\frac{N_{ij}}{N_{.j}} \quad \text{for all } i, j,$$

with the distribution of $t_{ij}$ arbitrary.

Assumption II would be satisfied if the $n_{.j}$ marbles are taken from the $N_{.j}$
in the $j$th urn by a random sampling procedure.
It is seen that I and II together imply

$$\text{III.}\qquad E(s_{ij} \mid t_{ij}) = n_{.j}\,\frac{t_{ij}}{N_{.j}} \quad \text{for all } i, j.$$

Assumptions I and III also imply II, but II and III do not imply I. For example,
consider II and III holding together with E(s,; | ni; , ti;) = 8:;ney + ejyni; . Then 
it is trivial to determine 4,; , ¢:; for each E(nj;) so that this assumption, together 
with II and III, is consistent. Consequently, it follows that III alone does not 
imply I and II and is therefore a weaker assumption. However, in an actual field 
situation it is likely that III will be satisfied only if I and II are. 

It can be seen that assumption III can be satisfied even though no mixing of 
the marbles took place before redistribution; in other words, the validity of the 
procedure does not depend on any assumption on the behaviour of the animals 
after tagging within their respective strata or on the effect of tagging on the 
migration pattern, provided that a random sampling procedure can be used. 

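If stratification is disregarded, then the usual estimate of $N_{..}$ is

$$(1)\qquad \hat N_0 = \frac{n_{..}\, t_{..}}{s_{..}},$$

although for small $s_{..}$,

$$(2)\qquad \hat N_1 = \frac{(n_{..} + 1)(t_{..} + 1)}{s_{..} + 1}$$

is preferable (see [3]).
The estimate proposed by Schaefer in the notation given here is

$$(3)\qquad \hat N_2 = \sum_{i=1}^{r} \sum_{j=1}^{r} s_{ij} \left(\frac{t_{i.}}{s_{i.}}\right) \left(\frac{n_{.j}}{s_{.j}}\right),$$
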


though in the derivation of this estimate Schaefer found it necessary to add some 
further assumptions. 

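An estimate based only on assumptions I and II, or III, is derived by observing
that we can write

$$(4)\qquad E\!\left[s_{ij}\,\frac{N_{.j}}{n_{.j}}\right] = t_{ij}, \qquad \text{so that} \qquad \sum_{j=1}^{r} E\!\left[s_{ij}\,\frac{N_{.j}}{n_{.j}}\right] = t_{i.}.$$

The set of equations

$$(5)\qquad \sum_{j=1}^{r} s_{ij}\,\frac{\hat N_{.j}}{n_{.j}} = t_{i.}, \qquad i = 1, 2, \ldots, r,$$

form a set of $r$ equations in $r$ unknowns which has a unique solution provided
that $|S| \ne 0$. The estimate of $N_{..}$ is then simply the sum of the $\hat N_{.j}$. The solution
of (5) and the estimate of $N_{..}$ are most simply expressed in matrix notation:

$$(6)\qquad \left(\frac{\hat N}{n}\right) = S^{-1} t,$$

$$(7)\qquad \hat N_3 = n' S^{-1} t.$$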
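The computation in (6) and (7) can be sketched in a few lines. The example below assumes the system is $S x = t$ with $x_j = \hat N_{.j}/n_{.j}$ and the estimate is $n'x$; the recovery matrix, tag numbers, and sample sizes are hypothetical values chosen only to illustrate the arithmetic, and the solver is a generic Gauss-Jordan elimination rather than anything prescribed by the paper.

```python
from fractions import Fraction

def solve_linear(matrix, rhs):
    """Solve matrix * x = rhs by Gauss-Jordan elimination in exact arithmetic."""
    size = len(rhs)
    aug = [[Fraction(v) for v in row] + [Fraction(b)]
           for row, b in zip(matrix, rhs)]
    for col in range(size):
        pivot = next((r for r in range(col, size) if aug[r][col] != 0), None)
        if pivot is None:
            raise ValueError("S is singular; the estimate is undefined")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [v / aug[col][col] for v in aug[col]]
        for r in range(size):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[-1] for row in aug]

def stratified_estimate(S, tags_released, sample_sizes):
    """Return the ratios N_.j / n_.j from (6) and the total estimate from (7)."""
    ratios = solve_linear(S, tags_released)
    total = sum(n * x for n, x in zip(sample_sizes, ratios))
    return ratios, total

# Hypothetical data: S[i][j] = tags released in stratum i, recovered in stratum j.
S = [[40, 10],
     [5, 45]]
t = [400, 500]    # tags released in each stratum (hypothetical)
n = [1000, 1200]  # sample size in each stratum (hypothetical)

ratios, N_hat = stratified_estimate(S, t, n)
print([float(x) for x in ratios], float(N_hat))
```

Exact fractions are used so that a singular or nearly singular $S$ is detected without rounding ambiguity; floating-point elimination would serve equally well for field data.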


3. Consistency of the estimates. The distribution of the s;;, depending as it 
does on the random variables ¢;; , is complex. We can make additional assump- 
tions as to the behaviour of the ¢;; and will do so later. If nothing is assumed 
about the ¢;;, then it appears that it is possible to study only the consistency 
properties of the proposed estimates. 

The property of consistency of estimates based on samples from a finite population
has been variously defined. Following one such usage, an estimate $\hat N$ of
$N_{..}$ would be called consistent if $\hat N = N_{..}$ whenever all $n_{.j} = N_{.j}$, i.e., whenever
the sample taken without replacement exhausts the population. This usage
makes the definition particular not only to the finiteness of the sample, but also
to the method of sampling. Moreover, it is certainly satisfied in this problem if
whenever $s_{ij} = t_{ij}$ for all $i, j$, $\hat N = N_{..}$, and it is easy to construct estimates
that satisfy this condition and are otherwise meaningless. Finally, from a practical
point of view, in the study of populations that number several hundred
thousand or several millions, it is unreasonable to think of a sample equalling
or nearly equalling the population size.

Yet at the same time it is possible that the samples may be very large—e.g., 
in the study noted in [7] it was of the order of 100,000 or roughly one-fifth of 
the population. With samples of this size, it is to be expected that if the samples 
are random the law of large numbers should be applicable. 

Hence, we consider the case $n_{.j}, N_{.j} \to \infty$, $n_{.j}/N_{.j} \to \lambda_j$, and

$$\frac{s_{ij}}{t_{ij}} \to \lambda_j \quad \text{in probability},$$

and say that $\hat N$ is a consistent estimator of $N_{..}$ if under these conditions
$\hat N/N_{..} \to 1$.

This assumption on the asymptotic behavior of the $s_{ij}$ is certainly fulfilled
if given $t_{ij}$ the conditional distribution of the $s_{ij}$ is multihypergeometric (or
multinormal). In terms of the sampling procedure, it may be said that this
situation is to be reasonably expected if the subarea is sufficiently small so that 
the sampling is uniform over it and each member of the population in it has an 
equal chance of capture. Or, again, it is a reasonable assumption if the subarea 
is so small that the N.; members of the population may be expected to mix 
freely and completely, regardless of the sampling uniformity. 

The degree of within-area mixing is a pervasive problem in population
estimation, as is the effect of tagging on the subsequent behavior of the tagged
animal. Ultimately, these questions can only be answered by using independent 
methods of population estimation. Still, it will always be important to analyze 
the data from any experiment to determine whether there is internal evidence 
to support the assumptions necessary for the population study. Some sugges- 
tions in this direction were made in [3] for a different tag-sample procedure, 
which might be adapted to this situation. Some further considerations along 
these lines are elaborated on in Section 5. 


Returning, then, to the question of consistency, under the conditions specified,
$\hat N_0/N_{..}$ tends in probability to

$$\frac{\lambda}{\sum_{j=1}^{r} \alpha_j \lambda_j},$$

where $\alpha_j = t_{.j}/t_{..}$, $\lambda = n_{..}/N_{..}$.

For arbitrary $t_{ij}$, and hence also $t_{.j}$, this ratio equals 1 if and only if $\lambda_j =
\lambda$, or $n_{.j}/N_{.j}$ is the same for all strata, i.e., if the sampling is proportional to
the population size in all strata.

Now consider $\hat N_2/N_{..}$. This ratio converges in probability to

$$\frac{1}{N_{..}} \sum_{i=1}^{r} \sum_{j=1}^{r} \frac{t_{ij}\, t_{i.}\, \lambda_j N_{.j}}{t_{.j} \sum_{a=1}^{r} \lambda_a t_{ia}},$$

which equals 1 for arbitrary $t_{ij}$, provided $n_{.j}/N_{.j} = \lambda_j$ is constant, i.e.,
proportional sampling is required in all strata.

Turning to $\hat N_3$, it is seen that, substituting $E(s_{ij})$ for $s_{ij}$ and $N_{.j}$ for $\hat N_{.j}$,
equations (5) are satisfied. Hence, if $|S| \ne 0$, the uniqueness of the solution of
this set of linear equations ensures that $\hat N_3$ is a consistent estimate of $N_{..}$.

It is a trivial exercise in arithmetic to construct examples to show that the 
estimate Ny, may be absurd for all values of the random variables when n. ;/N_; 
varies with 7. One such is given by the case where r 2 t; t Na =Nes 
and all animals migrate from stratum 1 to stratum 2 during the experiment 
except those tagged in stratum 1. Then it is seen that 

(2t;.)? < 21 


bi. + S22 


for all observed 80. . 

This is a pathological example, but perhaps it is not so unreal as might appear. 
Migration to stratum 2 may be normal behavior, but it is quite possible that 
tagging may produce abnormal behaviour such as failure to follow a migration 
instinct. In any case, it seems undesirable to use an estimate that may be mis- 
leading even for larger and larger samples, unless there is good reason for sup- 
posing that proportionate sampling has occurred. 

In the example above, if the estimate $\hat N_2$ were used, it would be satisfactory;
it may be checked that $\hat N_2$ is consistent if $t_{ij} = \delta_{ij} t_{i.}$, where $\delta_{ij}$ is the Kronecker
delta. While general examples might be given to illustrate the pathological
behaviour of $\hat N_2$ (though these, as well as a general study of the sampling properties
of the estimate, are made more difficult by the presence of the random $s_{ij}$ in
both numerator and denominator), we content ourselves with a simple numerical
example.

Let $r = 2$, $t_{1.} = t_{2.} = n_{.1} = n_{.2} = 1000$; $N_{1.} = 100{,}000$, $N_{2.} = 2000$; suppose
now that all tagged animals in area 1 migrate to area 2 but no others do. On
the other hand, the area 2 animals distribute themselves in the two areas equally.
Then $N_{11} = 99{,}000$, $N_{12} = 1000$, $N_{21} = 1000$, $N_{22} = 1000$. If $s_{21}$ is zero, $\hat N_2$ is
undefined; with probability $> 0.99$, however, $s_{21} > 0$, in which case $\hat N_2$ is (again
with probability $> 0.99$) less than 10,000, whereas $N_{..} = 102{,}000$.
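The numerical example can be reproduced approximately by replacing each recovery count with its expected value. The sketch below assumes that the area 2 tags split evenly ($t_{21} = t_{22} = 500$), so that the expected recoveries are $s_{11} = 0$, $s_{12} = 500$, $s_{21} = 5$, $s_{22} = 250$; these figures are an illustration derived from the stated population sizes, not observed data, and the function names are hypothetical.

```python
def petersen(S, t, n):
    """Unstratified estimate n.. t.. / s.. ."""
    return sum(n) * sum(t) / sum(sum(row) for row in S)

def schaefer(S, t, n):
    """Schaefer's estimate: sum of s_ij (t_i. / s_i.) (n_.j / s_.j)."""
    row_sums = [sum(row) for row in S]
    col_sums = [sum(col) for col in zip(*S)]
    return sum(S[i][j] * (t[i] / row_sums[i]) * (n[j] / col_sums[j])
               for i in range(len(t)) for j in range(len(n)))

def stratified_2x2(S, t, n):
    """n' S^{-1} t for r = 2, solving S x = t by Cramer's rule."""
    (a, b), (c, d) = S
    det = a * d - b * c
    x1 = (t[0] * d - b * t[1]) / det
    x2 = (a * t[1] - c * t[0]) / det
    return n[0] * x1 + n[1] * x2

# Expected recoveries under the assumptions stated in the lead-in.
S = [[0, 500],
     [5, 250]]
t = [1000, 1000]  # tags released in areas 1 and 2
n = [1000, 1000]  # sample sizes in areas 1 and 2

petersen_value = petersen(S, t, n)
schaefer_value = schaefer(S, t, n)
stratified_value = stratified_2x2(S, t, n)
print(round(petersen_value), round(schaefer_value), round(stratified_value))
```

With these inputs the stratified estimate recovers the true total of 102,000, while the Petersen and Schaefer values both fall well below 10,000.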


4. An alternative model. It has been noted that from some points of view the 
operations of tagging and sampling in these experiments are dual operations. It 
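might then be thought reasonable to make assumption I plus assumption II′,

$$\text{II}'.\qquad E(t_{ij}) = t_{i.}\,\frac{N_{ij}}{N_{i.}} \quad \text{for all } i, j,$$

with the distribution of $n_{ij}$ arbitrary.
I and II′ together imply a parallel to III, viz., III′:

$$\text{III}'.\qquad E(s_{ij} \mid n_{ij}) = n_{ij}\,\frac{t_{i.}}{N_{i.}} \quad \text{for all } i, j.$$
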

Assumption III requires that, on the average, in the sample of size $n_{.j}$ from the
population $N_{.j}$, the various tagged groups are proportionately represented.
The dual assumption III’ makes the same requirement, but treats the tagged 
group as the sample and the subsequent recovery as the property of being marked. 
This appears to be a less reasonable practical assumption in that it requires 
predicting the future behaviour of the animals marked. 

With respect to the urn model, this assumes that the $t_{i.}$ marked marbles are
completely stirred before any marbles are transferred to the other urns, so that
in choosing the marbles to be transferred there is no preference for marked or
unmarked marbles. Although this does not require stirring before sampling the
$N_{.j}$ marbles, it does still require that there be no preference for marked or
unmarked marbles in this final sample.

It is of interest to note that whereas no effect due to tagging must usually be 
assumed, I and II put no restriction on possible differential migration between 
tagged and untagged fish. Assumptions I and II’, however, do require that the 
migration pattern into the different recovery strata be the same for tagged 
and untagged fish. 

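From III′ a set of equations to yield estimates of the $N_{i.}$ can be written down.
They are

$$(8)\qquad \sum_{i=1}^{r} s_{ij}\,\frac{\hat N_{i.}}{t_{i.}} = n_{.j}, \qquad j = 1, 2, \ldots, r,$$

and their solution gives

$$(9)\qquad \hat N_3 = \sum_{i=1}^{r} \hat N_{i.} = t'(S')^{-1} n = n' S^{-1} t,$$

the same estimate as derived in the first model. If we further assume that as
$t_{i.}, N_{i.} \to \infty$, $t_{i.}/N_{i.} \to \mu_i$, $(s_{ij}/n_{ij})$ tends in probability to $E(s_{ij} \mid n_{ij})/n_{ij} =
t_{i.}/N_{i.} = \mu_i$, then $\hat N_3$ is a consistent estimator of $N_{..}$.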

However, $\hat N_0$, $\hat N_2$ will not necessarily be consistent estimators of $N_{..}$ unless
$t_{i.}/N_{i.} = \mu$ for all $i = 1, 2, \ldots, r$. There may be special values of the random
$n_{ij}$ (or in the earlier model, of the random $t_{ij}$) for which $\hat N_0$, $\hat N_2$ are consistent.
Thus, in this model $\hat N_2$ is consistent if $n_{ij} = \delta_{ij} n_{.j}$, which would, in general, be
true only if there were no intermixing (i.e., $N_{ij} = 0$ if $i \ne j$). In this case $N_{..}$
is the sum of $r$ separate subpopulations $N_{ii}$ $(i = 1, 2, \ldots, r)$ and the estimate
$\hat N_2$ reduces to the sum of $r$ estimates of the form of $\hat N_0$. $\hat N_3$ also reduces to the
same form in this special situation.

It may be thought that $\hat N_0$ would be a consistent estimator, if both $n_{.j}$ and
$t_{i.} \to \infty$ as $N_{i.}, N_{.j} \to \infty$ with $t_{i.}/N_{i.} \to \mu_i$, $n_{.j}/N_{.j} \to \lambda_j$ and if assumptions
I, II, and II′ hold. $\hat N_0$ will then be consistent if all $\mu_i$ are equal, or all $\lambda_j$ are
equal. Consider, however, the case where neither of these is true. Assumptions
I, II, and II′ imply

$$(10)\qquad E(s_{ij}) = t_{i.}\, n_{.j}\,\frac{N_{ij}}{N_{i.} N_{.j}},$$

and consistency of $\hat N_0$ under these circumstances will hold in general only if
$N_{ij}/N_{i.}N_{.j}$ = constant for all $i, j$. This condition, which means that there is
random or "independent" mixing between the various strata, will be considered
in Section 8.

It has been noted that equations (5) or (8) may be solved only if
$S$ is nonsingular. In many cases the nature of the situation dictates that $S$ be
nonsingular. If stratification is with respect to time of migration and the time
periods are defined so that an animal marked in any period $i$ cannot be recovered
in a period $j$ where $j < i$, then $s_{ij} = 0$ for all $j < i$. Hence, $S$ is nonsingular
provided that no $s_{ii} = 0$; it will thus certainly converge in probability to a
nonsingular matrix.

In cases of areal migration or mixing, usually $N_{ii}$ will be much larger than
$N_{ij}$ $(j \ne i)$, so that again in large samples $|S|$ will be not zero. The fur seal
study [6] already referred to is an example.

In general, $S$ will converge in probability to the matrix with elements
$\lambda_j t_{ij}$ (where $\lambda_j$ is the limit of $n_{.j}/N_{.j}$),
which is nonsingular provided $T = (t_{ij})$ is. The model and limiting conditions
are those associated with assumptions I and II. If, in addition, II′ is made and
$t_{i.}/N_{i.} \to \mu_i$ as $t_{i.}, N_{i.} \to \infty$, then $S$ will converge in probability to a matrix
which is nonsingular if the matrix $(N_{ij})$ is nonsingular.

Whether it is possible to construct a test of the hypothesis that $(N_{ij})$ is
nonsingular remains an open question. One case of singularity which has a simple
biological interpretation is that associated with the condition

$$(11)\qquad N_{ij} \propto N_{i.}\, N_{.j},$$

i.e., the random mixing just referred to.


5. Variance of $\hat N_3$. To derive the formula for the asymptotic variance of
$\hat N_3$ it is necessary to make additional assumptions on the distributions of the
random variables involved. To this end we make the following assumption IV:
The distribution of the $t_{ij}$ and the conditional distribution of the $s_{ij}$ given $t_{ij}$
are multinomial with expectation given by II′ and III.

It might be more reasonable from a practical point of view to assume that 
these distributions are multihypergeometric. In this case we here neglect the 
finite sampling corrections. 


It is now elementary to derive the variance of each s;; by working with condi- 


] 
N, (n; — 1) (1 pri 


tional expectations. In fact, 


Also, 


sta. th. Naz N 
(13) — —.i%a. %. AV aj Vb; 


~ Nu.Ns.N3;, ’ 
and 
(14) o(Sai%j) = 0, 
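From theorems on matrix differentiation,

$$(15)\qquad \frac{\partial \hat N_3}{\partial s_{ab}} = -\,n' S^{-1} I_{ab} S^{-1} t,$$

where $I_{ab}$ is the matrix with 1 in the $ab$th place and 0 everywhere else. This
reduces to

$$(16)\qquad \frac{\partial \hat N_3}{\partial s_{ab}} = -\left(\sum_{i} \frac{n_{.i}\, S_{ai}}{|S|}\right) \left(\sum_{j} \frac{t_{j.}\, S_{jb}}{|S|}\right),$$

$|S|$ being the determinant of $S$, $S_{ij}$ the signed cofactor of $s_{ij}$.
Under the assumptions,

$$(17)\qquad E(s_{ij}) = t_{i.}\, n_{.j}\,\frac{N_{ij}}{N_{i.} N_{.j}},$$

and substituting $E(s_{ab})$ for $s_{ab}$, it is readily seen that

$$(18)\qquad \frac{\partial \hat N_3}{\partial s_{ab}} = -\left(\frac{N_{a.}}{t_{a.}}\right) \left(\frac{N_{.b}}{n_{.b}}\right).$$
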
Now, as usual, terms of second and higher order in the Taylor expansion of 
N; may be neglected in the limit, i.e., as t;., n.;, Ni. , N.;—> ©. Then the asymp- 


totic variance of N; is easily calculated to be 


¥ y MoMeBs) + o-r~2 (1 — 4 
a ae 1) N N 


(19) 


Na; Nb, 

—2 x X Ve. € 

We write (19) in this form, in which it will be used, although from a mathe- 

matical point of view under the conditions stated, (19) must be infinite. The 

requirements of mathematical rigor can be met by considering the asymptotic 

variance of N,/N... The second term in this variance will be much smaller 
than the first. In fact, for most applications the approximation 


(20) «>> NuNiN 


will be sufficient. 

Since the estimates $\hat N_0$ and $\hat N_2$ may not be consistent if proportionate sampling
(or tagging) does not occur, these are not in competition with $\hat N_3$ in general. If
proportionate sampling does occur, then the several samples may be regarded
as a single random sample from the whole population. In this case $\hat N_0$ is the
maximum likelihood estimate and hence optimum from an asymptotic point
of view.

Assumption IV is much stronger than any of the earlier assumptions and
opens up the possibility of obtaining maximum likelihood or minimum $\chi^2$
estimates. However, the modified minimum $\chi^2$ estimates obtained by the use of
Lagrange multipliers would require the solution of $r^2 + 2r$ linear equations.
Even for $r$ as small as 3 this is hardly feasible for general usage in the absence of
special computing facilities. Whether these procedures could be simplified or
other estimates with optimum properties found, is unsolved.

With Assumption IV it is seen that for the large sample case with $n_{.j}/N_{.j} \to
0$, $t_{i.}/N_{i.} \to 0$, but $n_{.j}t_{i.}/N_{.j}$, $n_{.j}t_{i.}/N_{i.}$ remaining finite, the estimate $\hat N_3$ is a
maximum likelihood estimate. Even in this situation it is not to be expected
that an unbiased estimate of $N_{..}$ exists, since such an estimate would have to
be a nonlinear function of the $s_{ij}$. That this is in fact so may be proven similarly
to the situation in the case of a single sample from a single stratum as was done in
[2], by an appeal to the theorem of Barankin [1] on necessary and sufficient
conditions for the existence of unbiased estimates with finite variance.

When Assumption IV is made, further questions are opened up, viz., what 
practical sampling procedures may be expected to yield sample observations
which satisfy this requirement. It was indicated that uniformity of the sampling
over the subareas or uniformity of behavior within the subareas could be sufficient
to ensure the conditions that $\hat N_3$ be consistent. Assumption IV, however,
requires even more, namely, that the migration pattern of the animals be not
affected by tagging and that the migrations taken by different members be
independent of one another. Consequently, it is difficult to visualize a realistic
situation where there is a priori information that IV is in fact reasonable. The
experimenter will wish therefore to make what tests are possible on the data to
determine if it is consistent with this assumption. One such method of analysis
is to divide each of the $r$ groups of tags of size $t_{i.}$ randomly into $q_i$ subgroups
$t_{ik}$ $(k = 1, 2, \ldots, q_i)$. Then if Assumption IV holds,

$$E(s_{ijk}) = t_{ik}\, n_{.j}\,\frac{N_{ij}}{N_{i.} N_{.j}},$$

and for each $i, j$ a $\chi^2$ test for homogeneity may be made. Acceptance of this
hypothesis does not confirm the validity of Assumption IV, but it does lend
support to the analysis based on it.

On the contrary, if the hypothesis of homogeneity is rejected by the $\chi^2$ test
for one or several subgroups, the experimenter will do well to assume that
formula (19) does not give the correct variance of $\hat N_3$. To derive another formula
would require making other assumptions as to the distributions involved or as
to the lower moments of the several random variables. Since there seems to be
no reasonable alternative information in this direction, we suggest instead a
method to estimate the variance of $\hat N_3$. For suppose all $q_i$ defined above are
equal, say, to $q$ (i.e., each group of $t_{i.}$ tags is divided randomly into $q$ subgroups
$t_{i1}, \ldots, t_{iq}$). Then utilizing formula (7), we can construct $q$ estimates
$\hat N^{(1)}, \ldots, \hat N^{(q)}$ of $N_{..}$ and hence determine an estimate of the variance of
$\bar N = (1/q) \sum_{k=1}^{q} \hat N^{(k)}$ with $q - 1$ d.f.

The choice of the number of subgroups, either for this purpose or for the test
of homogeneity outlined above, will be strictly limited in practice; for if $q$ is
chosen too large, the subgroup $t_{ik}$ will be quite small and $E(s_{ijk}) \le
t_{ik}(n_{.j}/N_{.j})$ also. There will be many zeros in the matrix $S$ and the estimates
$\hat N^{(k)}$ will be extremely variable. With some knowledge of the extraneous variance
and of the degree of migration, it would be possible to set down working rules
for the choice of $q$. In many large-scale field experiments, the biologist will do
well to proceed in this manner until there is adequate assurance that more can
be assumed about the distribution of the $t_{ij}$.
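The replication device described above amounts to computing the sample variance of the $q$ subgroup estimates and dividing by $q$. A minimal sketch follows, assuming the $q$ estimates have already been obtained from formula (7) applied to each random subgroup of tags; the four values used are hypothetical and the function name is illustrative.

```python
import statistics

def replicate_variance(estimates):
    """Mean of q replicate estimates, estimated variance of that mean, and its d.f."""
    q = len(estimates)
    if q < 2:
        raise ValueError("at least two replicate estimates are needed")
    mean_estimate = statistics.mean(estimates)
    # Sample variance (divisor q - 1) of the replicates, divided by q,
    # estimates the variance of their mean.
    var_of_mean = statistics.variance(estimates) / q
    return mean_estimate, var_of_mean, q - 1

# Hypothetical subgroup estimates of the total population size.
subgroup_estimates = [98000, 105000, 101000, 104000]
mean_estimate, var_of_mean, dof = replicate_variance(subgroup_estimates)
print(mean_estimate, var_of_mean, dof)
```

As the text notes, taking $q$ large makes each subgroup estimate unstable, so in practice only a small number of replicates is likely to be usable.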


6. Additional strata; mortality. It is traditionally held that, even though mortality
may occur in a population between the time of tagging and of sampling, if
the chance of survival is the same among the tagged and untagged animals, the
Petersen estimate (i.e., $\hat N_0$), based on a single random sample from a population,
is essentially unaffected. In fact, in one case the distribution of the random number
of tag recoveries remains unchanged. For let

$N$, $N'$ = population size at tagging and at sampling, respectively,
$t$, $t'$ = number of tags at similar times,
$s$ = number of tag recoveries,
$n$ = sample size.

If $n$ is large relative to $t$ and hence $t'$, then the usual hypergeometric distribution
for $s$ is adequately replaced by a binomial distribution, i.e., $s \mid t'$ is $B(t', n/N')$.
Further, if $t'$ is $B(t, N'/N)$, then $s$ is $B(t, (n/N')(N'/N))$, i.e., $B(t, n/N)$, and hence
$s$ has the same distribution as if there were no mortality. Clearly, it will be
desirable to establish whether $\hat N_3$ enjoys a similar "robustness."
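The binomial argument can be verified directly: compounding $t' \sim B(t, N'/N)$ with $s \mid t' \sim B(t', n/N')$ should reproduce the probabilities of $B(t, n/N)$. The sketch below does this by exact summation; the numerical values of $t$, $N$, $N'$, and $n$ are hypothetical and serve only as an illustration.

```python
from math import comb

def binom_pmf(k, trials, p):
    """P(X = k) for X ~ B(trials, p)."""
    return comb(trials, k) * p ** k * (1.0 - p) ** (trials - k)

def recovery_pmf_with_mortality(s, tags, p_survive, p_capture):
    """P(s recoveries) when t' ~ B(tags, p_survive) and s | t' ~ B(t', p_capture)."""
    return sum(
        binom_pmf(alive, tags, p_survive) * binom_pmf(s, alive, p_capture)
        for alive in range(s, tags + 1)
    )

# Hypothetical values, chosen only for illustration.
t, N, N_prime, n = 20, 1000, 800, 100
p_survive = N_prime / N   # N'/N
p_capture = n / N_prime   # n/N'

# Largest discrepancy between the compound distribution and B(t, n/N).
max_gap = max(
    abs(recovery_pmf_with_mortality(s, t, p_survive, p_capture)
        - binom_pmf(s, t, n / N))
    for s in range(t + 1)
)
print(max_gap)
```

The discrepancy is at the level of floating-point rounding, which reflects the fact that thinning a binomial count by an independent binomial step gives another binomial with the product of the two probabilities.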

Consider first a somewhat more general situation where there are $k + 1$
strata, in the last of which no tags are placed and no sampling is done. We can
still write $E[s_{ij}(N_{.j}/n_{.j})] = t_{ij}$, but it no longer follows that $\sum_{j=1}^{k} E[s_{ij}(N_{.j}/n_{.j})]
= t_{i.}$, since $\sum_{j=1}^{k} t_{ij} \ne t_{i.}$. Since there are $(k + 1)^2$ essential parameters and
only $k^2$ observations (no additional information is contributed by $n_{.j} - \sum_i
s_{ij}$ in each sample), clearly $N_{..}$ cannot be estimated. Also, certainly, if we do not
trust the $t_{ij}$ to reflect the behaviour of the $N_{ij}$, the $N_{.j}$ or $\sum_j N_{.j}$ cannot be
estimated. It is also easy to give numerical examples in the case $r = 2$ to show
that even if $t_{ij}$ is assumed to have expectation $t_{i.}N_{ij}/N_{i.}$, the distribution of the
$s_{ij}$ may be the same for quite different values of the several parameters. Clearly,
identification is not possible.

If, however, sampling takes place in all strata, there will be $(k + 1)^2$ observations
and the solutions of equations (5) will lead to consistent estimates, provided
$|S| \neq 0$. The case where mortality occurs is related to this situation.
All strata are assumed to be tagged and sampled. Those dying form the $(k + 1)$st
stratum so that $N_{k+1,.} = 0$. Hence, if assumptions I and II' are regarded as valid,
then the equations (8) lead to consistent estimates of N;. for Doct Ni; 
Dia ni, = n.;. Hence, N., is also estimable. 

Furthermore, if an analysis similar to that at the beginning of this section
for the simple model is carried out, i.e., if $t'_{ij}$ is considered as the random
number of survivors of $t_{ij}$ with probability of survival $N'_{ij}/N_{ij}$, then the
modified variance of $s_{ij}$ can be obtained. The distribution is not identical with
the original, but the leading (and dominant) term in the asymptotic variance formula
is unchanged. This of course assumes a binomial model for survival as well as the
original assumptions and limiting conditions used to calculate (19). If any of these
fail to hold here, replication must be resorted to for variance estimates.


7. Variable number of strata. In some situations the number of strata will 
change between the times of tagging and sampling. This may occur either where 
the distribution is by area or by time. 

Suppose there are $m$ strata at the time of tagging or marking and $r$ strata at
the time of sampling or recovery. Consider $m > r$, with assumption III holding.
The equations (8) yield $m$ equations in $r$ unknowns. The simplest device is the
combination of some of the tagging periods or areas to form a system that has
a unique solution. Of course, using assumption IV, an optimum asymptotic
solution could be found by determining the modified minimum $\chi^2$ estimates.
If assumption III' holds, there are $r$ equations in $m$ unknowns, from (11), and
no solution is possible.

In case $m < r$, these conclusions are reversed; i.e., no estimation is possible with
assumption III, while estimation is possible with assumption III'. In general, it is
reasonable to expect that either assumption III or III' can be made so that estimation
of $N_{..}$ is possible.


8. Estimates of migration. In studies of migration, the $N_{ij}$ will be of interest.
If assumptions I, II, and II' are made, then $N_{i.}$ and $N_{.j}$ ($i, j = 1, 2, \ldots, r$)
are both estimable, estimates being determined as solutions of equations (8)
and (5), respectively. Also, an estimate of $N_{ij}$ is

(21)  $\hat{N}_{ij} = \dfrac{s_{ij}\,\hat{N}_{i.}\,\hat{N}_{.j}}{t_{i.}\,n_{.j}}$.

Since (17) now holds, the consistency of $\hat{N}_{i.}$, $\hat{N}_{.j}$, proven earlier, implies the
consistency of $\hat{N}_{ij}$. It is necessary to consider the dual limiting conditions that
$t_{i.}$, $n_{.j}$, $N_{i.}$, $N_{.j}$ all tend to infinity.

If it is permissible to make assumption IV and further, for simplicity, that
$t_{i.}/N_{i.} \to 0$, $n_{.j}/N_{.j} \to 0$, then by methods similar to those used in Section 6,
the asymptotic variances of the $\hat{N}_{i.}$, $\hat{N}_{.j}$, and $\hat{N}_{ij}$ may be derived.

Let 


and na; be the (signed) cofactor of N.; in 7. Then 


(N., 2, NaNeN 
ony av.(%2). 1 x 5 wNeNeNs 
A n\" ab ta. Ns 


(23) A.V. (Ay) _ 3 e > | v nisl" as N..N 4 4 
N\” «a b 


N ij s Ms 
It was noted in an earlier section that the situation

$N_{ij} \propto N_{i.} N_{.j}$

is of importance both from the biological aspect and from its effect on the
estimation problem.

Under the restrictions just noted it is easy to construct a test of this hypothesis.
For the $s_{ij}$ will be asymptotically normal with asymptotically
zero covariances, and it follows that they are asymptotically independent.
Thus, under the restrictions that $n_{.j}$ and $t_{i.}$ be small relative to $N_{..}$, an
approximate test of the hypothesis of complete mixing, i.e., (11), is based on the
statistic


op epg a 


where N = et ve. 

If the $n_{.j}$ are considerably larger than the $t_{i.}$ and are not small relative to
$N_{..}$, this test should be used with caution, for the type I error may be much
larger than the nominal significance level. This is partly due to the fact that
$\hat{N}_0$ is not exactly the modified minimum $\chi^2$ estimate of $N_{..}$. The inflation of $\chi^2$
in (14), caused by the underestimate of $\sigma^2(s_{ij})$, is more serious. The exact
variance of the $s_{ij}$ contains terms involving $N_{.j}$, $N_{i.}$, $N_{ij}$ which cannot be
estimated by the modified minimum $\chi^2$ method. Hence, no asymptotically efficient
estimates of these parameters exist under the hypothesis. An approximate
correction may be obtained by estimating the $N_{.j}$ from equations (5), and
substituting these estimates in (13).

Similar $\chi^2$ tests may be employed to test the hypothesis that sampling (or
tagging) is proportionate in the different strata, i.e.,

H: $n_{.j}/N_{.j} = \lambda_j = \lambda$ for $j = 1, 2, \ldots, r$.

For, under H, $E[s_{ij} \mid t_{ij}] = \lambda t_{ij}$ so that $E(s_{i.}) = \lambda t_{i.}$. Also, using assumption IV
(actually we may dispense with the assumption that the $t_{ij}$ have expectations
given by II'), for $t_{ij}$ small relative to $N_{.j}$, $\sigma^2(s_{i.}) = t_{i.}(\lambda - \lambda^2)$. Under H we may
estimate $\lambda$ by


a 


r 


2m ii. 


No N, 


so that the $\chi^2$ statistic (with $r - 1$ d.f.) is


(25) 


The test for proportionate tagging is obtained by interchanging $n$ and $t$ in
this formula.
DICHOTOMOUS EXPERIMENTS

1. Introduction and Summary. It may frequently happen that a researcher,
wishing to decide which one of a set of alternatives to accept, finds that there 
are several experiments available to him which he might perform to guide him 
in reaching his decision. Thus he is faced with making a preliminary decision 
as to which experiment or experiments he is to perform. If he admits the possi- 
bility of performing more than one, then the question of how many, which ones, 
and in what order arises. It is such questions as these that come under the head- 
ing of comparison and design of experiments. 

While a great deal of general theory of the design problem has been developed, 
e.g., by Wald [1] and Maguire [2], few actual solutions of particular problems, 
especially of the sequential type, have been investigated thus far. Robbins' paper
[3] is the first published report dealing with various sequential rules for particular 
nontruncated design problems. The basic purpose of this paper is to investigate 
for certain cases the optimal design, which almost uniformly turns out to be 
exceedingly complicated (see Section 3), and to propose and determine some 
justification for certain simpler criteria. 

Attention is restricted to problems in which there are but two alternatives,
or hypotheses, $H_1$ and $H_2$, and it is required to decide between them with a loss
of one unit if the false one is selected, while no loss results from selecting the
correct one. Further, $\xi$ will denote the a priori probability that $H_1$ is true, and
the basic criterion for comparison will be the Bayes risks associated with the
various experiments. To say that an experiment is available to a researcher is to
say that there is a real random variable which he can observe whose distribution
is known under each hypothesis.

As an example of a situation in which this type of question may arise, consider
the problem of deciding between utilizing a use test as against a specifications
test for acceptance of a lot of manufactured items. A large lot of items has been
produced and a decision is to be made between, say, $\omega_1$ and $\omega_2$ as being the
proportion of defectives in the lot. Let $X = 1$ or $0$ according as an item selected at
random is defective or not as determined by subjecting it to a use test. Let $Y = 1$
or $0$ according as an item selected at random is classified as defective (because
it fails to meet certain specifications) or not. If $\alpha$, the probability that a
nondefective item fails to meet the specifications, and $\beta$, the probability that a
defective item meets the specifications, are known, then both $X$ and $Y$ have a
binomial distribution with known parameter under each hypothesis.
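
The parameters of the two experiments in this example can be written down directly. The sketch below is illustrative only: the function names and the numerical values of $\omega_1$, $\omega_2$, $\alpha$, $\beta$ are hypothetical. It uses the fact that an item is classified as defective either when it is defective and fails the specifications, or when it is nondefective and fails them.

```python
def spec_test_defect_probability(omega, alpha, beta):
    """P(Y = 1) when the lot has a proportion omega of defectives.

    alpha: P(nondefective item fails the specifications)
    beta:  P(defective item meets the specifications)
    """
    return omega * (1.0 - beta) + (1.0 - omega) * alpha


def experiment_parameters(omega_1, omega_2, alpha, beta):
    """Binomial parameters of X (use test) and Y (specifications test) under H_1, H_2."""
    return {
        "X": (omega_1, omega_2),
        "Y": (
            spec_test_defect_probability(omega_1, alpha, beta),
            spec_test_defect_probability(omega_2, alpha, beta),
        ),
    }


# Hypothetical values, chosen only for illustration.
params = experiment_parameters(omega_1=0.05, omega_2=0.20, alpha=0.10, beta=0.15)
print(params)
```

Misclassification pulls the two parameters of $Y$ toward each other, which is why the specifications test is the less sharp of the two experiments in such a setting.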

Again, it might be that in the course of a series of treatments of a material 
there are two points at which a certain characteristic may be measured, say 
the breaking strength of a metal undergoing a series of heat treatments. Let X
and Y, respectively, denote the value of the characteristic at the different points 
in the process. It is reasonable to assume that under each of two simple hy- 
potheses concerning the process and material, X and Y have prescribed normal 
distributions. 

In general, suppose that $X$ and $Y$ are two real random variables having
distribution functions $F_i$ and $G_i$, respectively, under hypothesis $H_i$ and with
corresponding densities $f_i$ and $g_i$ with respect to a common measure, $\mu$, such
that $f_i > 0$ if and only if $g_i > 0$. Let $R_Z(\xi)$ denote the Bayes risk against $\xi$ when
using experiment $Z$.
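
For an experiment with finitely many outcomes, the Bayes risk under the zero-or-one loss is simple to compute: at each outcome the Bayes procedure selects the hypothesis with the larger posterior weight, and the smaller weight is the contribution to the risk. A minimal sketch follows; the function name and the example distributions are hypothetical.

```python
def bayes_risk(xi, f1, f2):
    """Bayes risk against prior xi = P(H_1) with zero-or-one loss.

    f1, f2: probabilities of each outcome under H_1 and H_2 (same ordering).
    """
    return sum(min(xi * a, (1.0 - xi) * b) for a, b in zip(f1, f2))


# Hypothetical experiment: a single 0/1 observation with
# P(1 | H_1) = 0.2 and P(1 | H_2) = 0.6.
f1 = [0.8, 0.2]
f2 = [0.4, 0.6]
for xi in (0.1, 0.5, 0.9):
    print(xi, bayes_risk(xi, f1, f2))
```

The risk never exceeds $\min(\xi, 1 - \xi)$, the risk of deciding without any observation.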

Now the computation and comparison of the risks, $R_X(\xi)$ and $R_Y(\xi)$, for all
$\xi$, which appears to be necessary in order to obtain an optimal design, is
intrinsically complicated in most cases, as will be seen in the following sections. Hence
it is of some interest to investigate some more convenient criteria for choosing
between experiments. Any such criterion should, of course, dictate the use of
$X$ if $R_X \leq R_Y$ (i.e., $R_X(\xi) \leq R_Y(\xi)$ for all $\xi$). To be able to check on this is but
one reason for interest in conditions that $R_X \leq R_Y$. Another is that whenever
it is true that $R_X \leq R_Y$, then regardless of the choice of actions open to the
researcher or the loss function used, use of $X$ will never yield a greater Bayes
risk than $Y$. Also, if a total of, say, $n$ independent experiments is to be performed,
the optimal sequential design is the nonsequential rule: take all $n$
observations of $X$ ([4], [5], [8]).

In Section 2.1 general conditions that $R_X \leq R_Y$ are derived, some of which
are related to those obtained by Blackwell [5] via consideration of the standard
experiment. In Section 2.2 the Kullback-Leibler (abbreviated hereafter as
K-L) information numbers are introduced, and it is shown, in particular, that
they provide a criterion which yields an especially simple necessary condition
that $R_X \leq R_Y$. The K-L information numbers are also considered as functions
of that transformation, $t$, such that the distribution of $t(X)$ under $H_1$ is the
distribution of $X$ under $H_2$. The case in which all the distributions involved
are normal is analyzed in some detail in Section 2.3, where it is seen that the
K-L numbers do not yield a sufficient condition that $R_X \leq R_Y$, though they
do yield a sufficient condition that $R_X = R_Y$. The normal case gives an example
in which a second criterion, that of being "locally more informative" (Bayes)
at both zero and 1, yields a condition both necessary and sufficient that
$R_X \leq R_Y$. $X$ is termed locally more informative than $Y$ at $\xi$ if $\xi$ lies in an interval
$[\xi_1, \xi_2]$ such that on $[0, 1] \cap [\xi_1, \xi_2]$, $R_X(\xi) \leq R_Y(\xi)$, with strict inequality at
at least one end point. This latter criterion is discussed further in Section 2.4.

The problem of determining the optimal designs, sequential and nonsequen- 
tial, for the case in which all distributions are binomial and a fixed number of 
experiments is to be performed is discussed in Section 3. The complete solutions 
are found to be exceedingly complicated; a few are given. For the sequential 
design, a system for obtaining the optimal design which avoids the complete 
calculation of the successive risk functions was found. 

In the final section, a sequential rule for terminating experimentation is
considered, and the problem of finding a sequential design which minimizes the
expected number of experiments is posed. Two reasonable designs are proposed, 
are shown to be equivalent, and are shown to be better than either of the rules 
which require that all observations be of the same random variable. 

Throughout the paper it will be convenient to consider $\xi/(1 - \xi)$, and this
will regularly be denoted by $\eta$.


2.1. Uniform inequality of risk functions. With respect to a measure $\mu$, $X$ and
$Y$ have densities under $H_i$ as shown by

                        X        Y
($\xi$)        $H_1$    $f_1$    $g_1$
($1 - \xi$)    $H_2$    $f_2$    $g_2$,

where $f_1(x) > 0$ if and only if $f_2(x) > 0$, and similarly for the $g_i$'s. It is required
to decide which of the hypotheses is true on the basis of one observation, either
of $X$ or of $Y$, with the usual zero-or-one loss.

Interest in the case of one observation stems not merely from general curiosity;
but, as mentioned before, if for one observation $R_X \leq R_Y$, then for any
number of observations, any set of actions, and any loss function, consistent
use of $X$ yields a Bayes risk less than or equal to that of any other design. In view
of this strong property it is clear that the one-observation case holds some
interest, and that no criterion for choice of experiment should be seriously
considered which would dictate the use of $Y$ when $R_X \leq R_Y$.


LEMMA 2.1. Two conditions, each necessary and sufficient, that $R_X(\xi) \leq R_Y(\xi)$
(respectively $=$, $\geq$) are:

(i)  $\int_0^\infty \min(u - \eta, 0)\, dE(u) \leq \int_0^\infty \min(u - \eta, 0)\, dF(u)$  (respectively $=$, $\geq$),

(ii)  $\int_0^\infty \min\left(1 - \frac{\eta}{u}, 0\right) dG(u) \leq \int_0^\infty \min\left(1 - \frac{\eta}{u}, 0\right) dH(u)$  (respectively $=$, $\geq$),

where $E$ and $G$ are the c.d.f.'s of $f_2(x)/f_1(x)$ under $H_1$ and $H_2$, respectively, and
$F$ and $H$ are the c.d.f.'s of $g_2(x)/g_1(x)$ under $H_1$ and $H_2$, respectively.
PROOF. From the well-known theory of Bayes solutions (see [3], Chap. 6),
the Bayes risk against $\xi$ using $X$ is given by

(1)  $R_X(\xi) = \xi \int_{f_2(x)/f_1(x) > \xi/(1-\xi)} f_1(x)\, d\mu(x) + (1 - \xi) \int_{f_2(x)/f_1(x) \leq \xi/(1-\xi)} f_2(x)\, d\mu(x)$.

With $\eta = \xi/(1 - \xi)$, this can be written as

(2)  $\dfrac{R_X(\xi)}{1 - \xi} - \eta = \int_{f_2(x)/f_1(x) \leq \eta} f_2(x)\, d\mu(x) - \eta \int_{f_2(x)/f_1(x) \leq \eta} f_1(x)\, d\mu(x)$.

With

$E(u) = \int_{f_2(x)/f_1(x) \leq u} f_1(x)\, d\mu(x)$,

(3)  $\dfrac{R_X(\xi)}{1 - \xi} - \eta = \int_0^\eta u\, dE(u) - \eta \int_0^\eta dE(u) = \int_0^\infty \min(u - \eta, 0)\, dE(u)$.

With

$G(u) = \int_{f_2(x)/f_1(x) \leq u} f_2(x)\, d\mu(x)$,

(4)  $\dfrac{R_X(\xi)}{1 - \xi} - \eta = \int_0^\eta dG(u) - \eta \int_0^\eta \frac{1}{u}\, dG(u) = \int_0^\infty \min\left(1 - \frac{\eta}{u}, 0\right) dG(u)$.

With the analogous expressions for the risk associated with $Y$, the conclusion
is immediate.
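
For a finite experiment, the representation of the risk through the distribution of the likelihood ratio under $H_1$ can be checked numerically against the direct definition. The sketch below is an illustration under stated assumptions: the function names and the three-outcome example are hypothetical, and all outcome probabilities are taken to be positive so that the likelihood ratio is defined everywhere.

```python
def bayes_risk(xi, f1, f2):
    """Direct Bayes risk with zero-or-one loss, prior xi on H_1."""
    return sum(min(xi * a, (1.0 - xi) * b) for a, b in zip(f1, f2))


def risk_from_likelihood_ratio(xi, f1, f2):
    """Risk computed as (1 - xi) * (eta + integral of min(u - eta, 0) dE(u)).

    E is the distribution of u = f2/f1 under H_1: mass f1(x) at u = f2(x)/f1(x).
    """
    eta = xi / (1.0 - xi)
    integral = sum(a * min(b / a - eta, 0.0) for a, b in zip(f1, f2))
    return (1.0 - xi) * (eta + integral)


# Hypothetical three-outcome experiment.
f1 = [0.5, 0.3, 0.2]
f2 = [0.1, 0.3, 0.6]
gaps = [
    abs(bayes_risk(k / 10, f1, f2) - risk_from_likelihood_ratio(k / 10, f1, f2))
    for k in range(1, 10)
]
print(max(gaps))
```

Since the two computations agree, comparing two experiments at a given $\xi$ reduces to comparing the integrals of $\min(u - \eta, 0)$ against their respective likelihood-ratio distributions.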


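LEMMA 2.2. (i) $R_X(\xi) \leq R_Y(\xi)$ (respectively $=$, $\geq$) if and only if

$\int_0^\eta \alpha_X\left(\frac{u}{1 + u}\right) du \leq \int_0^\eta \alpha_Y\left(\frac{u}{1 + u}\right) du$  (respectively $=$, $\geq$),

where $\alpha_X(\xi)$ is the probability, under $H_1$, that in following the Bayes procedure
against $\xi$ with $X$, $H_2$ will be chosen.

(ii) If $G(u)/u \to 0$ as $u \to 0$, then $R_X(\xi) \leq R_Y(\xi)$ (respectively $=$, $\geq$) if and only if

$\int_0^\eta \beta_X\left(\frac{u}{1 + u}\right) \frac{du}{u^2} \geq \int_0^\eta \beta_Y\left(\frac{u}{1 + u}\right) \frac{du}{u^2}$  (respectively $=$, $\leq$),

where $\beta_X(\xi)$ is the probability under $H_2$ that in following the Bayes procedure against
$\xi$ with $X$, $H_1$ will be chosen.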
PROOF. From equation (3) of the preceding proof,

(1)  $\dfrac{R_X(\xi)}{1 - \xi} - \eta = \int_0^\eta (u - \eta)\, dE(u)$.

Integrating by parts and noting that $E(u) = 1 - \alpha_X(u/(1 + u))$,

$\dfrac{R_X(\xi)}{1 - \xi} = \int_0^\eta \alpha_X\left(\frac{u}{1 + u}\right) du$.

From the similar expression involving $R_Y(\xi)$, conclusion (i) follows.

A parallel argument from equation (4) in the proof of Lemma 2.1 and $G(u) =
\beta_X(u/(1 + u))$ yields conclusion (ii).
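
The integrated form of the risk, namely $R_X(\xi)/(1 - \xi)$ equal to the integral of $\alpha_X(u/(1+u))$ over $0 \leq u \leq \eta$, can be illustrated on a finite experiment. The sketch below is a rough numerical check only: the function names, the example distributions, and the midpoint-rule step count are hypothetical choices, and the quadrature is approximate because $\alpha_X$ is a step function.

```python
def alpha(xi, f1, f2):
    """P(choose H_2 | H_1) for the Bayes procedure against prior xi."""
    eta = xi / (1.0 - xi)
    return sum(a for a, b in zip(f1, f2) if b / a > eta)


def risk_from_alpha(xi, f1, f2, steps=20_000):
    """(1 - xi) times the midpoint-rule integral of alpha(u / (1 + u)) over [0, eta]."""
    eta = xi / (1.0 - xi)
    h = eta / steps
    integral = h * sum(
        alpha(u / (1.0 + u), f1, f2) for u in ((k + 0.5) * h for k in range(steps))
    )
    return (1.0 - xi) * integral


def bayes_risk(xi, f1, f2):
    """Direct Bayes risk with zero-or-one loss, for comparison."""
    return sum(min(xi * a, (1.0 - xi) * b) for a, b in zip(f1, f2))


# Hypothetical 0/1 experiment.
f1, f2 = [0.8, 0.2], [0.4, 0.6]
print(risk_from_alpha(0.5, f1, f2), bayes_risk(0.5, f1, f2))
```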

THEOREM 2.1. Two conditions, each necessary and sufficient that $R_X = R_Y$, are:

(i) $f_2/f_1$ and $g_2/g_1$ have the same distributions under $H_1$;

(ii) $f_2/f_1$ and $g_2/g_1$ have the same distributions under $H_2$.

PROOF. The sufficiency is immediate from Lemma 2.1. To show the necessity,
suppose $R_X = R_Y$; then for all $\eta \geq 0$,

(1)  $\int_0^\infty \min(u - \eta, 0)\, dE(u) = \int_0^\infty \min(u - \eta, 0)\, dF(u)$.

Now, for any $a > 0$, let $\phi_n(u) = n \min(u - a, 0)$ and let $\psi_n(u) = -n \min
(u - (a + 1/n), 0)$, $n = 1, 2, 3, \ldots$. Then

(2)  $\int_0^\infty (\phi_n(u) + \psi_n(u))\, dE(u) = \int_0^\infty (\phi_n(u) + \psi_n(u))\, dF(u)$

for all $n$. Hence,

$E(a) + \int_{a < u \leq a + 1/n} (1 - n(u - a))\, dE(u) = F(a) + \int_{a < u \leq a + 1/n} (1 - n(u - a))\, dF(u)$.

Letting $n \to \infty$, $E(a) = F(a)$, i.e., the likelihood ratios $f_2/f_1$ and $g_2/g_1$ have the
same distribution under $H_1$. It follows immediately that $\alpha_X(\xi) = \alpha_Y(\xi)$, since
$E(u) = 1 - \alpha_X(u/(1 + u))$. But $R_X(\xi) = \xi\alpha_X(\xi) + (1 - \xi)\beta_X(\xi)$; hence, $R_X =
R_Y$ and $\alpha_X = \alpha_Y$ imply $\beta_X = \beta_Y$, which is conclusion (ii).


2.2. Relations between Bayes risk and the K-L information numbers. With
these conditions that $R_X \leq R_Y$, attention is turned to the relation between the
condition $R_X \leq R_Y$ and the K-L information numbers.

The mean information per observation of $X$ for discriminating between $H_1$
and $H_2$ when $H_1$ is true as defined by Kullback and Leibler [6], [7] is

(2.2.1)  $I_X(1:2) = \int f_1(x) \log \dfrac{f_1(x)}{f_2(x)}\, d\mu(x)$

and

$I_X(2:1) = \int f_2(x) \log \dfrac{f_2(x)}{f_1(x)}\, d\mu(x)$.

The mean divergence between $H_1$ and $H_2$ per observation of $X$ is then defined
to be

(2.2.2)  $J_X = I_X(1:2) + I_X(2:1)$.

$I_X(1:2)$ and $I_X(2:1)$ will be referred to as the K-L numbers for $X$. The K-L
numbers and the divergence for $Y$ are similarly defined.


It is noted that if the distribution of $X$ is of the exponential type, i.e., $f_\omega(x) =
\beta(\omega)e^{\omega x}$, then

(2.2.3)  $I_X(1:2) = \log \dfrac{\beta(\omega_1)}{\beta(\omega_2)} + (\omega_1 - \omega_2) E_{\omega_1}[X]$;

$I_X(2:1) = \log \dfrac{\beta(\omega_2)}{\beta(\omega_1)} + (\omega_2 - \omega_1) E_{\omega_2}[X]$;

and

$J_X = (\omega_2 - \omega_1)(E_{\omega_2}[X] - E_{\omega_1}[X])$.

Thus, $J_X$ is an interesting measure of the "distance" between $H_1$ and $H_2$ relative
to the random variable $X$, being the product of two often-considered measures.

If $I_X(1:2) \geq I_Y(1:2)$ and $I_X(2:1) \geq I_Y(2:1)$, then one would say that, in
the K-L sense, $X$ is the more informative. When considering an a priori
probability, $\xi$, it seems natural to consider the numbers

(2.2.4)  $I_X(\xi) = \xi I_X(1:2) + (1 - \xi) I_X(2:1)$

and $I_Y(\xi)$ analogously defined. Then $X$ is the more informative in the K-L sense
if and only if $I_X(\xi) \geq I_Y(\xi)$ for all $\xi$, i.e., $I_X \geq I_Y$.

Comparison of $I_X(\xi)$ and $I_Y(\xi)$ provides a very simple criterion for choice
between $X$ and $Y$, especially in the sequential design problem, where after the
$j$th experiment one could simply compute the a posteriori probability, $\xi_j$, and
choose as the $(j + 1)$th experiment that corresponding to the greater of $I_X(\xi_j)$
and $I_Y(\xi_j)$. It is hardly to be expected that this rule would be optimal, but it
will be shown that it will never lead to use of $Y$ when $R_X \leq R_Y$. This rule also
possesses some other nice properties described later in the paper.
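
The rule just described is easy to state in code for experiments with finitely many outcomes. The following is a sketch only: the function names and the two example experiments are hypothetical, and it is assumed that every outcome has positive probability under both hypotheses so that the logarithms are finite.

```python
from math import log


def kl_numbers(f1, f2):
    """Return (I(1:2), I(2:1)) for outcome probabilities f1 under H_1 and f2 under H_2."""
    i12 = sum(a * log(a / b) for a, b in zip(f1, f2))
    i21 = sum(b * log(b / a) for a, b in zip(f1, f2))
    return i12, i21


def weighted_information(xi, f1, f2):
    """I(xi) = xi * I(1:2) + (1 - xi) * I(2:1)."""
    i12, i21 = kl_numbers(f1, f2)
    return xi * i12 + (1.0 - xi) * i21


def posterior(xi, f1, f2, outcome):
    """A posteriori probability of H_1 after observing the outcome with the given index."""
    numerator = xi * f1[outcome]
    return numerator / (numerator + (1.0 - xi) * f2[outcome])


def choose_experiment(xi, experiments):
    """Name of the experiment with the greater I(xi); experiments maps name -> (f1, f2)."""
    return max(experiments, key=lambda name: weighted_information(xi, *experiments[name]))


# Hypothetical pair of 0/1 experiments.
experiments = {
    "X": ([0.8, 0.2], [0.4, 0.6]),
    "Y": ([0.7, 0.3], [0.5, 0.5]),
}
xi = 0.5
choice = choose_experiment(xi, experiments)
xi_next = posterior(xi, *experiments[choice], outcome=1)
print(choice, xi_next)
```

In a sequential application, `xi_next` would be fed back into `choose_experiment` before the next observation.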

THEOREM 2.2. If $R_X \leq R_Y$, then $I_X \geq I_Y$.

PROOF: Again with $E$ and $F$ as defined in Lemma 2.1,

(1)  $\int_0^\infty u\, dE(u) = \int f_2(x)\, d\mu(x) = 1$.

Similarly, $\int_0^\infty u\, dF(u) = 1$. Hence, for $\phi$ any linear function

(2)  $\int_0^\infty \phi(u)\, dE(u) = \int_0^\infty \phi(u)\, dF(u)$.

By Lemma 2.1, $R_X \leq R_Y$ implies

(3)  $\int_0^\infty \min(u - \eta, 0)\, dE(u) \leq \int_0^\infty \min(u - \eta, 0)\, dF(u)$.

It is easily seen by (2) and (3), then, that for any concave function, $\phi$,

(4)  $\int_0^\infty \phi(u)\, dE(u) \leq \int_0^\infty \phi(u)\, dF(u)$.

In particular, for $\phi(u) = \log u$,

(5)  $I_X(1:2) = -\int_0^\infty \log u\, dE(u) \geq -\int_0^\infty \log u\, dF(u) = I_Y(1:2)$;

while for $\phi(u) = -u \log u$,

(6)  $-I_X(2:1) = -\int_0^\infty u \log u\, dE(u) \leq -\int_0^\infty u \log u\, dF(u) = -I_Y(2:1)$.

Equations (5) and (6) yield the conclusion of the theorem.

An immediate corollary is that $R_X = R_Y$ implies $I_X = I_Y$, i.e., $I_X(1:2) =
I_Y(1:2)$ and $I_X(2:1) = I_Y(2:1)$.

As will be seen in Section 2.3, $I_X \geq I_Y$ does not in general imply $R_X \leq R_Y$.
However, in all of the special cases considered by the authors, including the
standard distributions (binomial, Poisson, Gamma, and normal), it was true
that $I_X = I_Y$ implied $R_X = R_Y$.

An illuminating view of the K-L numbers is obtained (see Theorem 2.3,
below) if it is assumed that all the densities under consideration are elements of
a class, $\{f_\omega : \omega \in \Omega\}$, of densities positive on the same set, and that there is an
Abelian group, $T$, of transformations of the domain of the $f_\omega$'s and a corresponding
group, $\bar{T}$, of transformations of $\Omega$ such that if $X$ has
density $f_\omega$, then for $t \in T$, $t(X)$ has density $f_{\bar{t}(\omega)}(x) = u(t^{-1}) f_\omega(t^{-1}x)$, i.e.,
$d\mu(t^{-1}x) = u(t^{-1})\, d\mu(x)$. Finally, assume that given $\omega_1$ and $\omega_2$ in $\Omega$, there is a
$t \in T$ such that $\omega_2 = \bar{t}(\omega_1)$.

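As an example of such a class of densities and groups of transformations,
consider the $\Gamma$ distributions

(2.2.5)  $f_\omega(x) = \dfrac{\omega^\alpha}{\Gamma(\alpha)}\, x^{\alpha - 1} e^{-\omega x}$  ($\omega > 0$, $x > 0$),

with $T = \{t_c : t_c(x) = cx,\ c > 0\}$, $u(t_c^{-1}) = c^{-1}$, and $\bar{t}_c(\omega) = c^{-1}\omega$.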
THEOREM 2.3. $I_X(1:2)$ and $I_X(2:1)$ are functions only of the transformation
that carries $f_1$ into $f_2$ and not of $f_1$ and $f_2$ individually.

PROOF. Choose $t \in T$ such that $f_2(x) = f_1(t^{-1}x)u(t^{-1})$. Then

(1)  $I(1:2) = \int f_1(x) \log \dfrac{f_1(x)}{f_1(t^{-1}x)\, u(t^{-1})}\, d\mu(x) = -\log u(t^{-1}) + \int f_1(x) \log \dfrac{f_1(x)}{f_1(t^{-1}x)}\, d\mu(x)$.

To show that the value of the integral is a function of $t$ only and does not depend
on $f_1$, choose $f_0 \in \{f_\omega\}$ and let $f_1(x) = f_0(s^{-1}x)u(s^{-1})$. Then, with $y = s^{-1}x$,
$t^{-1}y = t^{-1}s^{-1}x = s^{-1}t^{-1}x$ and (1) can be rewritten as

(2)  $I(1:2) = -\log u(t^{-1}) + \int f_0(y) \log \dfrac{f_0(y)}{f_0(t^{-1}y)}\, d\mu(y)$.

A similar proof holds for $I(2:1)$ and the proof is complete.
For the example of the $\Gamma$ distributions, if the parameters under $H_1$ and $H_2$
are $\omega$ and $a\omega$, respectively, the K-L numbers are given explicitly as functions
of $a$ by

(2.2.6)  $I_X(1:2) = \alpha[-\log a - 1 + a]$

and

$I_X(2:1) = \alpha\left[\log a - 1 + \dfrac{1}{a}\right]$.

It can be verified that for this class the relation between $T$ and the K-L numbers
is one to one.
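
The closed forms for the $\Gamma$ example, $I_X(1:2) = \alpha[-\log a - 1 + a]$ and $I_X(2:1) = \alpha[\log a - 1 + 1/a]$, can be compared with a direct numerical integration, which also illustrates that the value does not depend on $\omega$. This is a rough check only: the function names, the integration range, and the step count are hypothetical choices, and a plain midpoint rule is used in place of any library quadrature.

```python
from math import exp, lgamma, log


def gamma_kl_numbers(alpha, a):
    """Closed-form K-L numbers when the rate is omega under H_1 and a * omega under H_2."""
    i12 = alpha * (-log(a) - 1.0 + a)
    i21 = alpha * (log(a) - 1.0 + 1.0 / a)
    return i12, i21


def gamma_log_density(x, alpha, omega):
    """Log of omega^alpha x^(alpha - 1) exp(-omega x) / Gamma(alpha)."""
    return alpha * log(omega) + (alpha - 1.0) * log(x) - omega * x - lgamma(alpha)


def numeric_i12(alpha, omega, a, upper=60.0, steps=100_000):
    """Midpoint-rule approximation of the integral of f_1 log(f_1 / f_2) over (0, upper)."""
    h = upper / steps
    total = 0.0
    for k in range(steps):
        x = (k + 0.5) * h
        log_f1 = gamma_log_density(x, alpha, omega)
        log_f2 = gamma_log_density(x, alpha, a * omega)
        total += exp(log_f1) * (log_f1 - log_f2) * h
    return total


closed_i12, closed_i21 = gamma_kl_numbers(alpha=2.0, a=1.5)
print(closed_i12, numeric_i12(2.0, 1.0, 1.5), numeric_i12(2.0, 2.0, 1.5))
```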

Whenever the random variable $Y$ is obtainable from $X$ by a relabeling (more
precisely, whenever there is a $t \in T$ such that $Y$ and $t(X)$ have the same
distribution under each hypothesis), we write $Y \sim t(X)$. In such a case, clearly,
$I_X = I_Y$ and $R_X = R_Y$.

If, as in the $\Gamma$ distributions, the relation between $T$ and the K-L numbers is
one to one, then $I_X = I_Y$ if and only if $R_X = R_Y$, since $I_X = I_Y$ implies
$Y \sim t(X)$ implies $R_X = R_Y$ implies $I_X = I_Y$. Without the one to one condition
we have

THEOREM 2.4. $Y \sim t(X)$ implies $R_X = R_Y$, and if the likelihood ratios, $f_2/f_1$
and $g_2/g_1$, are monotone in the same direction, then $R_X = R_Y$ implies that
$Y \sim t(X)$.

PROOF. The first statement is clear. Without loss of generality, let $X$ and $Y$
have the common density $h$ under $H_1$ and densities $f$ and $g$ respectively under
$H_2$. It then suffices to show that $f = g$.

From Theorem 2.1, if $R_X = R_Y$, then

(1)  $\int_{f(x)/h(x) \leq \eta} h(x)\, d\mu(x) = \int_{g(x)/h(x) \leq \eta} h(x)\, d\mu(x)$  for all $\eta \geq 0$.

By the monotonicity of the likelihood ratios,

$\{x : f(x)/h(x) \leq \eta\} = \{x : x \leq y_1(\eta)\}$,

$\{x : g(x)/h(x) \leq \eta\} = \{x : x \leq y_2(\eta)\}$.

Since all densities are assumed positive together, it follows from (1) that $y_1 =
y_2$. Hence,

(2)  $\{x : f(x) \leq \eta h(x)\} = \{x : g(x) \leq \eta h(x)\}$.

If $g(x_0) > f(x_0)$, then an $\eta$ can be found such that $f(x_0) < \eta h(x_0) < g(x_0)$,
contradicting (2). If $g(x_0) < f(x_0)$, then a similar contradiction arises. Therefore,
$f = g$.

2.3. The case of normal distributions. We now consider the case in which
both $X$ and $Y$ have normal distributions under each hypothesis. Since for normal
distributions, both the risk function and the K-L numbers are invariant under
affine transformations, there is no loss of generality in treating the situation
given by

                        X                     Y
($\xi$)        $H_1$    $N(0, 1)$             $N(0, 1)$
($1 - \xi$)    $H_2$    $N(\mu, \sigma^2)$    $N(m, v)$


where $\mu \geq 0$, $m \geq 0$, and $\sigma^2 \geq v$.

The interest in a thoroughgoing study of this case follows from the central
role of the normal distribution in statistics, especially in that determination of
the cases in which $I_X \geq I_Y$ and in which $R_X \leq R_Y$ in terms of the parameters
$\mu$, $\sigma$, $m$, and $v$ would be of use in asymptotic comparisons for many types of
problems.

The K-L numbers for $X$ are:

(2.3.1)  $I_X(1:2) = \dfrac{1}{2}\left[\log \sigma^2 - 1 + \dfrac{1}{\sigma^2} + \dfrac{\mu^2}{\sigma^2}\right]$

and

$I_X(2:1) = \dfrac{1}{2}\left[\log \dfrac{1}{\sigma^2} - 1 + \sigma^2 + \mu^2\right]$.

Those for $Y$ are the same, with the obvious substitutions.
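
The K-L numbers for the normal case, $I(1:2) = \tfrac{1}{2}[\log \sigma^2 - 1 + 1/\sigma^2 + \mu^2/\sigma^2]$ and $I(2:1) = \tfrac{1}{2}[-\log \sigma^2 - 1 + \sigma^2 + \mu^2]$ for $N(0, 1)$ against $N(\mu, \sigma^2)$, can be evaluated and spot-checked as follows. The sketch is illustrative: the function names, the integration range, and the example parameters are hypothetical, and the check uses a plain midpoint rule.

```python
from math import exp, log, pi


def normal_kl_numbers(mean, var):
    """K-L numbers for H_1: N(0, 1) against H_2: N(mean, var)."""
    i12 = 0.5 * (log(var) - 1.0 + 1.0 / var + mean**2 / var)
    i21 = 0.5 * (-log(var) - 1.0 + var + mean**2)
    return i12, i21


def normal_log_density(x, mean, var):
    return -0.5 * log(2.0 * pi * var) - (x - mean) ** 2 / (2.0 * var)


def numeric_i12(mean, var, half_range=12.0, steps=48_000):
    """Midpoint-rule approximation of the integral of f_1 log(f_1 / f_2), f_1 = N(0, 1)."""
    h = 2.0 * half_range / steps
    total = 0.0
    for k in range(steps):
        x = -half_range + (k + 0.5) * h
        log_f1 = normal_log_density(x, 0.0, 1.0)
        log_f2 = normal_log_density(x, mean, var)
        total += exp(log_f1) * (log_f1 - log_f2) * h
    return total


i12, i21 = normal_kl_numbers(mean=1.0, var=4.0)
print(i12, i21, numeric_i12(1.0, 4.0))
```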
THEOREM 2.5. The following three statements are equivalent:
(i) $R_X = R_Y$;
(ii) $I_X = I_Y$;
(iii) $\sigma^2 = v$ and $\mu = m$.
PROOF. By Theorem 2.2, (i) implies (ii) and clearly (iii) implies (i). Assuming
(ii) to be true,

(1)  $\log \dfrac{v}{\sigma^2} + \sigma^2 - v + \mu^2 - m^2 = 0$,

(2)  $\log \dfrac{\sigma^2}{v} + \dfrac{1}{\sigma^2} - \dfrac{1}{v} + \dfrac{\mu^2}{\sigma^2} - \dfrac{m^2}{v} = 0$.

Then if (iii) is false, it must be that $\sigma^2 > v$, which leads to the following
absurdities:

Case I: $\sigma^2 > 1$. Upon adding equation (1) to $-v$ times equation (2), it is
seen that $\mu^2$ is of the same sign as

(3)  $A(\sigma^2, v) = (v + 1) \log \dfrac{\sigma^2}{v} + \dfrac{v}{\sigma^2} - 1 + v - \sigma^2$.

But $A(v, v) = 0$, and $\dfrac{d}{d\sigma^2} A(\sigma^2, v) = \dfrac{1}{\sigma^4}(\sigma^2 - v)(1 - \sigma^2) < 0$. Hence,
$\mu^2 < 0$, a clear absurdity.

Case II: $\sigma^2 < 1$. Starting with the sum of equation (1) and $-\sigma^2$ times
equation (2), a similar argument reaches the absurdity $m^2 < 0$.

The same line of reasoning establishes the

COROLLARY: For $v < \sigma^2$, $I_X(1:2) = I_Y(1:2)$ implies that $I_X(2:1) > I_Y(2:1)$.
For $v > \sigma^2$, $I_X(2:1) = I_Y(2:1)$ implies $I_X(1:2) > I_Y(1:2)$.

For a further analysis of the case of normal distributions, assume $\mu$ and $\sigma^2$
fixed, $\sigma^2 > 1$, and consider the $(v, m^2)$ plane. The results of the remainder of
this section may be summarized by figure 1, where the $h_i$ are defined as follows:








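FIG. 1

$h_1(v) = v\{\log \sigma^2 + (\mu^2 + 1)/\sigma^2\} - v \log v - 1$, and $I_X(1:2) \geq I_Y(1:2)$
if and only if $m^2 \leq h_1(v)$;

$h_2(v) = \mu^2 + \sigma^2 - \log \sigma^2 + \log v - v$, and $I_X(2:1) \geq I_Y(2:1)$ if and only
if $m^2 \leq h_2(v)$;

$h_3(v) = (v - 1)\left\{\dfrac{\mu^2}{\sigma^2 - 1} + \log \dfrac{\sigma^2}{v}\right\}$.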
o—l1 
And $h_3$ has two equivalent characterizations:

(i) Let $\xi_X$ be the maximum $\xi$ for which $R_X(\xi) = \xi$, and let $\xi_Y$ be the maximum
$\xi$ for which $R_Y(\xi) = \xi$. Then $\xi_X <, =, > \xi_Y$ if and only if $m^2 <, =, > h_3(v)$,
respectively.

(ii) $R_X \leq R_Y$ if and only if $v \leq \sigma^2$ and $m^2 \leq h_3(v)$.
$R_X \geq R_Y$ if and only if $v \geq \sigma^2$ and $m^2 \geq h_3(v)$.

One of the more curious aspects of this is that no matter how great may be
the difference in the means, $m$ and $\mu$, under $H_2$, that random variable having
the smaller variance under $H_2$ cannot have a risk uniformly less than or equal
to that of the random variable having the larger variance.

That hi(v) S he(v) for v S o° with equality only for v = a, that hi(v) > 
ha(v) for v > o’, and that h.(v) > 0 follow from Theorem 2.5 and Corollary. 

Thus, by Theorem 2.2, a necessary condition for $R_X \leq R_Y$ is that $m^2 \leq \min
[h_1(v), h_2(v)]$; and for $R_X \geq R_Y$, that $m^2 \geq \max [h_1(v), h_2(v)]$.

We now consider the risk functions, and for simplicity it will be assumed until
explicitly stated otherwise that $v > 1$ and $\sigma^2 > 1$. The probabilities of the two
types of errors when using the Bayes procedure based on $Y$ are:

(2.3.2)  $\alpha_Y(\xi) = 1 - \Pr\left(\left|Y + \dfrac{m}{v - 1}\right| \leq \dfrac{\sqrt{v}}{v - 1}\sqrt{m^2 + (v - 1) \log \eta^2 v}\ \Big|\ H_1\right)$,

and

$\beta_Y(\xi) = \Pr\left(\left|Y + \dfrac{m}{v - 1}\right| \leq \dfrac{\sqrt{v}}{v - 1}\sqrt{m^2 + (v - 1) \log \eta^2 v}\ \Big|\ H_2\right)$,

where it is to be understood that $\alpha_Y(\xi) = 1$ and $\beta_Y(\xi) = 0$ whenever $m^2 +
(v - 1) \log \eta^2 v < 0$.

Thus, $R_Y(\xi) = \xi\alpha_Y(\xi) + (1 - \xi)\beta_Y(\xi) = \xi$ for $m^2 + (v - 1) \log \eta^2 v < 0$,
i.e., for small $\xi$. Furthermore, $R_Y(\xi) < \xi$ for $\xi$ such that $\log \eta^2 > -[m^2/(v - 1) +
\log v]$. This latter follows from $(d/d\xi) R_Y(\xi) = \alpha_Y(\xi) - \beta_Y(\xi)$, a fact quite
generally true in statistical games with two states of nature, two actions, and a
zero-one loss function ([8], Sec. 6.3).

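Together with the analogous results for $R_X(\xi)$, it is clear that a necessary
condition that $R_X \leq R_Y$ is that

$\dfrac{\mu^2}{\sigma^2 - 1} + \log \sigma^2 \geq \dfrac{m^2}{v - 1} + \log v$,

or

(2.3.3)  $m^2 \leq h_3(v) = (v - 1)\left\{\dfrac{\mu^2}{\sigma^2 - 1} + \log \dfrac{\sigma^2}{v}\right\}$.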
It may be verified that

(2.3.4)  $h_1(v) - h_3(v)$ is $> 0$, $= 0$, or $< 0$ according as $v < \sigma^2$, $v = \sigma^2$, or $v > \sigma^2$.

Hence, any pair $(v, m^2)$ with $h_3(v) < m^2 < h_1(v)$ and $1 \leq v < \sigma^2$ provides an
example in which $X$ is the more informative in the K-L sense but does not have
a uniformly smaller Bayes risk.
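
A numerical instance of such a pair may be helpful. The sketch below is illustrative only: the function names and the parameter values $\mu = 1$, $\sigma^2 = 4$, $v = 2$, $m^2 = 1.2$ are hypothetical. It uses $h_1(v) = v\{\log \sigma^2 + (\mu^2 + 1)/\sigma^2\} - v \log v - 1$ and $h_3(v) = (v - 1)\{\mu^2/(\sigma^2 - 1) + \log(\sigma^2/v)\}$, together with the fact stated above that the risk of an experiment with $H_2$-mean $M$ and $H_2$-variance $V > 1$ equals $\xi$ exactly when $\log \eta^2 \leq -[M^2/(V - 1) + \log V]$.

```python
from math import exp, log


def h1(v, mu, sigma2):
    return v * (log(sigma2) + (mu**2 + 1.0) / sigma2) - v * log(v) - 1.0


def h3(v, mu, sigma2):
    return (v - 1.0) * (mu**2 / (sigma2 - 1.0) + log(sigma2 / v))


def trivial_risk_cutoff(mean_sq, var):
    """Largest prior xi for which the Bayes risk equals xi (requires var > 1)."""
    eta = exp(-0.5 * (mean_sq / (var - 1.0) + log(var)))
    return eta / (1.0 + eta)


# Hypothetical parameter values.
mu, sigma2, v, m_sq = 1.0, 4.0, 2.0, 1.2

print(h3(v, mu, sigma2), m_sq, h1(v, mu, sigma2))
xi_x = trivial_risk_cutoff(mu**2, sigma2)
xi_y = trivial_risk_cutoff(m_sq, v)
print(xi_y, xi_x)
```

For any $\xi$ strictly between the two cutoffs, $R_X(\xi) = \xi$ while $R_Y(\xi) < \xi$, so $X$ cannot have the uniformly smaller risk even though $m^2 < h_1(v)$.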

For $1 \leq v \leq \sigma^2$, the most restrictive necessary condition that $R_X \leq R_Y$ thus
far obtained is that $m^2 \leq h_3(v)$. Omitting much of the tedious algebraic detail,
we now show that this is also sufficient.

LEMMA 2.3. For fixed $v > 1$, $R_Y(\xi)$ is a nonincreasing function of $m$ for each $\xi$.

PROOF: From equation (2.3.2) it follows that, with

$L = \dfrac{\sqrt{v}}{v - 1}\sqrt{m^2 + (v - 1) \log \eta^2 v}$,
| Pe . 2 
vei[ B® — ]- 5 ['0[-a(¢- 525) 
Ll—¢ —L -~—)3 


l e L 
ox - 
kasd p| 


Letting asb denote that a and b are of the same sign, 


Y oL 
Ry(é)s so | va Rr _ |= 2. 


om 


E 


LenB -oeeL-a(- ety) 
' " > / ex eee xg —— — ae — —a 
Lom \Vv P 2u 9-1 — . v—1 


where the rather complicated expression denoted here by A can be shown to be 
zero. Carrying out the indicated differentiation and integration in the second 
term and effecting a similar reduction, we have 


a v—1f 1 mv \? 1 / 
(3) —Rr(é =< @ —-y( L — @ ——({L — 
3) om Rr(@)s Vv jexp| 3 (J > v— “) | exp| (1 


Since $L \geq 0$, $(\partial/\partial m) R_Y(\xi) \leq 0$, which completes the proof.

Now let $\phi$ be a nonnegative differentiable function of $v$ ($v > 1$) with $\phi(\sigma^2) = \mu^2$.
Set $m^2 = \phi(v)$ and consider $R_Y(\xi)$ as a function of $v$. From equation (1) in the
proof of Lemma 2.3,


‘ Ry(é) ) [ 1 | l v | F —4(t—C)2 
2 ‘9, a ith = a i 
(2.3.5) ~/2 (Pe n atte exp om (t — vC)? |dt — n ’ € dl, 


where

$C = \dfrac{\sqrt{\phi(v)}}{v - 1}$,  $L = \sqrt{v}\,\sqrt{C^2 + \dfrac{\log \eta^2 v}{v - 1}}$.

By a sequence of steps parallel to those of the proof of Lemma 2.3, we arrive
at the conclusion:


a dcv—1 Li+ =) ee 
a Pr@s = fo exp ( 7 el 


L 
[ (? — v'C* — v) exp | -2 (t — 00)" | dt, 
p dy 2v 


(2.3.6) 


$\dfrac{dC}{dv} = \dfrac{(v - 1)\phi'(v) - 2\phi(v)}{2(v - 1)^2 \sqrt{\phi(v)}}$.

Now take $\phi = h_3$ in equation (2.3.3); $dC/dv = -(1 + vC^2)/[2v(v - 1)C]$.
For this choice, the right member of (2.3.6) is nonpositive for all $L \geq 0$. To
show this, consider the right member of (2.3.6) as a function, $\psi$, of $L$ for $L \geq 0$.
Now, $\psi(0) = \psi(+\infty) = 0$. Hence, it will suffice to show that there is an $L^*$ such
that $\psi'(L) \leq 0$ for $L \leq L^*$ and $\psi'(L) \geq 0$ for all $L \geq L^*$. Differentiating $\psi$
and simplifying,


(23.7) v(L)se"” | =. + ets: +x. 


Let the right member of (2.3.7) be denoted by $\gamma(L)$. Then


(23.8) y(L) = & E + 20 (1 ~! +) 41 


and 
vy’ (L) = 40°e*"(L — vC). 


Hence, $\gamma$ is concave on the interval $(0, vC)$ and convex on $(vC, +\infty)$. But
$\gamma(0) = 0$, $\gamma(+\infty) = +\infty$, and $\gamma'(0) < 0$. Therefore, there exists an $L^*$ such
that $\gamma$, and hence $\psi'$, is negative for $L < L^*$ and positive for $L > L^*$. In this
way the proof is complete for

LEMMA 2.4. For $m^2 = h_3(v)$, $v \geq 1$, $R_Y(\xi)$ is a nonincreasing function of $v$ for
each $\xi$.

Combining the last two lemmas with the fact that for $1 \leq v \leq \sigma^2$, $m^2 \leq h_3(v)$
is a necessary condition for $R_X \leq R_Y$, while for $v \geq \sigma^2$, $m^2 \geq h_3(v)$ is a necessary
condition for $R_Y \leq R_X$, gives

LEMMA 2.5. For $1 \leq v \leq \sigma^2$, $R_X \leq R_Y$ if and only if $m^2 \leq h_3(v)$, while for
$v \geq \sigma^2$, $R_Y \leq R_X$ if and only if $m^2 \geq h_3(v)$.

Since this necessary and sufficient condition that $R_X \leq R_Y$ for $1 \leq v \leq \sigma^2$
was obtained from consideration of the behavior of the risk curves for small $\xi$,
it was conjectured that their behavior for $\xi$ near 1 would yield conditions that
$R_X \geq R_Y$.

LEMMA 2.6. For $v$ and $\sigma^2$ both greater than 1, a necessary condition for $R_Y \leq R_X$
is that $v \geq \sigma^2$, and a necessary condition for $R_X \leq R_Y$ is that $v \leq \sigma^2$.

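PROOF.

$\lim_{\xi \to 1} R_Y(\xi)/(1 - \xi) = -\lim_{\xi \to 1} R_Y'(\xi) = -\lim_{\xi \to 1} (\alpha_Y(\xi) - \beta_Y(\xi)) = 1$,

and the same relation holds for $R_X(\xi)$. Then for

(1)  $D(\eta) = \dfrac{R_Y(\xi) - R_X(\xi)}{1 - \xi}$,

$D(\eta) \to 0$ as $\eta \to +\infty$.
The method of proof is to show that for all sufficiently large $\eta$,

$D'(\eta) < 0$ if $v < \sigma^2$,
$D'(\eta) > 0$ if $v > \sigma^2$,

and hence that for $\xi$ sufficiently near 1,

$R_Y(\xi) > R_X(\xi)$ for $v < \sigma^2$,
$R_Y(\xi) < R_X(\xi)$ for $v > \sigma^2$.

Now

$D'(\eta) = \alpha_Y(\xi) - \alpha_X(\xi) = \int_{-\mu/(\sigma^2 - 1) - \Lambda}^{-\mu/(\sigma^2 - 1) + \Lambda} \dfrac{1}{\sqrt{2\pi}} e^{-t^2/2}\, dt - \int_{-m/(v - 1) - L}^{-m/(v - 1) + L} \dfrac{1}{\sqrt{2\pi}} e^{-t^2/2}\, dt$,

where

$L = \dfrac{\sqrt{v}}{v - 1}\sqrt{m^2 + (v - 1) \log \eta^2 v}$  and
$\Lambda = \dfrac{\sigma}{\sigma^2 - 1}\sqrt{\mu^2 + (\sigma^2 - 1) \log \eta^2 \sigma^2}$.

As $\eta \to +\infty$,

$\dfrac{L}{\Lambda} \to \sqrt{\dfrac{v(\sigma^2 - 1)}{(v - 1)\sigma^2}}$, which is $> 1$ for $v < \sigma^2$ and $< 1$ for $v > \sigma^2$,

and both $L$ and $\Lambda \to +\infty$. Hence, for $v < \sigma^2$, $L - \Lambda \to +\infty$ and for all
sufficiently large $\eta$,

$\left(-\dfrac{m}{v - 1} - L,\ -\dfrac{m}{v - 1} + L\right) \supset \left(-\dfrac{\mu}{\sigma^2 - 1} - \Lambda,\ -\dfrac{\mu}{\sigma^2 - 1} + \Lambda\right)$,

and it follows that $D'(\eta) < 0$. On the other hand, if $v > \sigma^2$, $\Lambda - L \to +\infty$;
and the same reasoning shows that for all sufficiently large $\eta$, $D'(\eta) > 0$.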
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It has, then, been completely determined, for v > 1 and o” > 1, precisely 
when Rx S Ry and precisely when Rx = Ry, namely, 


THEOREM 2.6. For $\nu > 1$ and $\sigma^2 > 1$,

(i) $R_X \le R_Y$ if and only if $\nu \le \sigma^2$ and $m^2 \le h_1(\nu)$, and

(ii) $R_X \ge R_Y$ if and only if $\nu \ge \sigma^2$ and $m^2 \ge h_1(\nu)$.

The case $\nu < 1$ and $\sigma^2 < 1$ may clearly be converted, by interchange of $H_1$,
$H_2$, to the above case; whereas if $\nu$ and $\sigma^2$ lie on opposite sides of 1, neither
$R_X \le R_Y$ nor $R_Y \le R_X$ can hold, since for $\nu < 1$, $R_Y(\xi) = 1 - \xi$ on an
interval $(\xi_0, 1)$.


2.4. Locally more informative experiments. The notion of being "locally
more informative" arises naturally from the example just analysed. We will say
that $X$ is locally more informative than $Y$ at $\xi$ if $\xi$ lies in an interval $[\xi_1, \xi_2]$ such
that on $[\xi_1, \xi_2] \cap [0, 1]$, $R_X(\xi) \le R_Y(\xi)$ with strict inequality at one end point.
The example of the normal distributions provides an instance in which a necessary
and sufficient condition that $R_X \le R_Y$ is that $X$ be locally more informative
than $Y$ at the two points 0 and 1.

For binomial distributions, the form of the risk curves makes it immediately
clear that locally more informative at 0 and 1 is equivalent to uniform inequality
of risks. Thus, if the parameters are given by

$$\begin{array}{c|cc} & X & Y \\ \hline H_1 & p_1 & q_1 \\ H_2 & p_2 & q_2 \end{array} \qquad (p_2 > p_1),$$

then, in terms of the parameters, $R_X \le R_Y$ if and only if


Bm. &,,>l-n l—-p ne Pr | 
Pr a, ee ra 


ae -p>2>1—-P 


— - q1 [= 

Another instance in which locally more informative at 0 and 1 is equivalent
to uniform inequality of risks is the case of the $\Gamma$ distribution (equation 2.2.5).
From this it is easily found that if the parameters are given by

$$\begin{array}{c|cc} & X & Y \\ \hline H_1 & w_1 & \theta_1 \\ H_2 & w_2 & \theta_2 \end{array}$$

then $R_X \le R_Y$ if and only if $w_2/w_1 \ge \theta_2/\theta_1 \ge 1$ or $w_2/w_1 \le \theta_2/\theta_1 \le 1$.

The conjecture that if all the distributions belong to the same exponential 
family, then being locally more informative at zero and one is equivalent to uni- 
form inequality of risks, may be shown to be false, however, by the example in 
which dy(r) = e 7/2" dz. 


3. Optimal designs for a binomial testing problem. If we now consider the 
case in which a total of n experiments is to be carried out, the question arises 
as to how difficult it would be to obtain, for a specific case, the optimal sequential
designs. If they are practically obtainable, the interest in any other design cri- 
teria which have some justification though not optimal is reduced to pure curios- 
ity. It is found, however, that this is not the situation. For the case of binomial 
distributions, the optimal designs for small n were found and are given below 
for a few of the cases in order to illustrate the intrinsically complicated structure 
of the optimal designs. 

Suppose that $X$ and $Y$ have binomial distributions with parameters given by

$$\begin{array}{c|cc} & X & Y \\ \hline H_1 & p & q \\ H_2 & q & p, \end{array}$$


that the observations are independent, that the total number of observations to 
be taken, n, is fixed, and that the cost of observation is independent of the true 
hypothesis, of the random variable observed, and of the outcome of the ob- 
servation. 

The complete form of the solution for the case $n = 3$ will now be exhibited.
The solutions are expressed in terms of subintervals of $[0, 1]$ on the $\eta$ axis
corresponding to subintervals of $[0, \tfrac12]$ on the $\xi$ axis, which by symmetry may be
extended to $[0, 1]$ on the $\xi$ axis. (J denotes indifference.) We present here only the
optimal design of the first experiment for the case $n = 3$. The description of the
optimal designs for the remaining steps will be omitted.

(3.1) For p(1 — p)’ > q(1 — q)’ and p(1 — p)® < q(l — q)’: 


optimal choice : Pee’ ae eee Be EE LPS Y 


q PT ee 
0 
7 (2) 


For p (1 — p)’ > g(1 — gq)’ and p(l — p)’ > q(l — q)’: 


¢ql—q) qi-q 1-p 


2) ; 
(2 p(l— p)’pi-p) l1—-—@q 





optimal choice: J : X : JT : X : }T : X 
q\ gq’ g(l—p) gqil—q gil — gq) 
. « OO sass 
P \p pil — p) p(l — p) p(l — p)’ 
For p(l — p)’ < g(1 — gq)’ and @(1 — gq)’ < pl — p)’: 
optimal choice: J : X : TIT : X Re as 
q\ q\ gq(l—q) qqi-—4q) 
. 0 @® @ mea 
P P p(l—p) p p(l — p) 
For p(1 — p)’ < g(1 — g)’ and (1 — q)’ > pl — p)’: 
epGeaniebees: 2153 2-38 223 <tBee Sein 3) ot 


oo ® & mS 
p p i—¥¢ p p(l — p) 
where


_(—-pPd-p-g-—qi-® 
G+ gu ~9-@-Fe-f 
4. A nontruncated design problem. In the preceding sections attention has 
been fixed entirely on design problems in which the sample size was fixed and in 
which we were interested in reducing the Bayes risk. In considering nontrun- 
cated procedures, any design will, if only it calls for a sufficiently large number of 
observations of relevant random variables, reduce the risks from the terminal 
decision below any arbitrary level. What we do here is fix a risk, r, and consider 
certain designs with a view to minimizing the expected number of experiments 
required to reduce the risk from the terminal decision to at most r. If the ex- 
periments have equal cost, this is equivalent to reducing the risk to at most r 
in the most economical way. 
One idea in considering the nontruncated problem below was to create a 
symmetric problem in the hope that more satisfactory results could be obtained. 
Suppose, as in the previous section, that X and Y have binomial distributions 
with parameters given by 
$$\begin{array}{c|cc} & X & Y \\ \hline H_1 & p & q \\ H_2 & q & p \end{array} \qquad (p > q,\ p(1 - p) > q(1 - q)).$$


The problem can be stated as being that of minimizing the expected number
of observations required to move the a posteriori probability for $H_1$ to a position
in either $[0, r]$ or $[1 - r, 1]$.

Three reasonable rules are considered and shown to be equivalent, and are
shown, by a more general theorem, to be better than always using either one of
the experiments.

Let $\xi_0 = \xi$ and let $\xi_j$ denote the a posteriori probability for $H_1$ after having
made the first $j$ observations. It will be convenient to consider the problem in
terms of the variable $y = \log \eta = \log\, \xi / (1 - \xi)$. Then let

$$a = \log \frac{p}{q} \quad \text{and} \quad A = \log \frac{1 - r}{r}.$$

Let $n$ denote the smallest value of $j$ for which either $\xi_j \le r$ or $\xi_j \ge 1 - r$. It is
seen there are two random walks, both on the $y$-axis with boundaries at $A$ and
$-A$, one of which is determined by the results of observations of $X$ and the
other by the results of observations of $Y$. After having made $j$
observations one finds that the walk has arrived at the point $y_j$. Now the choice
must be made as to whether it is better that $y_{j+1}$ should be determined by an
observation of $X$ or of $Y$, i.e., whether the next step should be taken in the
$X$-walk or in the $Y$-walk. A rule is desired prescribing for every situation which
walk should be taken in order to minimize the expected value of $n$.

If at the $(j + 1)$th step $X$ is observed, the expected movement in the walk is

(4.1) $E_X[y_{j+1} - y_j] = \xi_j J_X - I_X(2{:}1).$

Since $J_X$ is positive, $E_X[y_{j+1} - y_j]$ is an increasing function of $y_j$ and is positive
for $y_j > -y^* = \log\,[I_X(2{:}1) / I_X(1{:}2)]$.
Similarly, if $Y$ is observed, the expected movement is

(4.2) $E_Y[y_{j+1} - y_j] = \xi_j J_X - I_X(1{:}2),$

which is again an increasing function of $y_j$ and is positive for $y_j > y^*$.

It can be verified that $y^* > 0$ and hence, for $y_j > 0$ ($\xi_j > \tfrac12$), the $X$-walk
yields an expected step greater in magnitude than the $Y$-walk and in the "right"
direction, i.e., toward the nearest boundary, $A$. For $y_j < 0$, the $Y$-walk enjoys
the same advantages, $-A$ being the nearest boundary.
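The expected movements (4.1) and (4.2) can be checked numerically. The sketch below is illustrative only: it assumes that $I_X(1{:}2)$ and $I_X(2{:}1)$ are the Kullback-Leibler information numbers of the two binomial hypotheses, that $J_X = I_X(1{:}2) + I_X(2{:}1)$, and that $p = 0.5$, $q = 0.2$ (values chosen here, not taken from the paper). The function names are hypothetical.

```python
import math

def kl_bernoulli(a, b):
    """Information number I(a:b) for one Bernoulli trial (assumed definition)."""
    return a * math.log(a / b) + (1 - a) * math.log((1 - a) / (1 - b))

def expected_step(xi, p1, p2):
    """Expected change in y = log(xi / (1 - xi)) after one observation whose
    success probability is p1 under H1 and p2 under H2, obtained by direct
    enumeration of the two outcomes and Bayes' rule."""
    y = math.log(xi / (1 - xi))
    total = 0.0
    for f1, f2 in ((p1, p2), (1 - p1, 1 - p2)):      # success, failure
        prob = xi * f1 + (1 - xi) * f2               # marginal probability
        post = xi * f1 / prob                        # posterior probability of H1
        total += prob * (math.log(post / (1 - post)) - y)
    return total

p, q = 0.5, 0.2              # illustrative: p > q and p(1-p) > q(1-q)
I12 = kl_bernoulli(p, q)     # I_X(1:2)
I21 = kl_bernoulli(q, p)     # I_X(2:1)
J = I12 + I21                # J_X
y_star = math.log(I12 / I21)

for xi in (0.3, 0.5, 0.7):
    e_x = expected_step(xi, p, q)    # X: p under H1, q under H2
    e_y = expected_step(xi, q, p)    # Y: q under H1, p under H2
    print(f"xi={xi:.1f}  E_X={e_x:+.4f} (formula {xi * J - I21:+.4f})  "
          f"E_Y={e_y:+.4f} (formula {xi * J - I12:+.4f})")
print("y* =", y_star)
```

For these illustrative parameters the enumeration agrees with the closed forms, $y^*$ is positive, and for $\xi > \tfrac12$ the $X$-step exceeds the $Y$-step in magnitude.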

These considerations lead to the conjecture that, at least for $a$ small relative
to $A$, the optimal design may be to take the $X$-walk on the $(j + 1)$th step if
$y_j > 0$ and to take the $Y$-walk otherwise. (If $p(1 - p) < q(1 - q)$, the same
results will hold with $X$ and $Y$ interchanged.)

Another reasonable rule would be to use $X$ on the $(j + 1)$th step whenever,
starting from $y_j$, observation of $X$ consistently to termination of experimentation
has a smaller expected number of observations than consistent observation
of $Y$. Let $X_\infty$ denote the rule requiring $X$ at every step, let $E[n \mid X_\infty, y, H_i]$
denote the expected number of steps in the $X$-walk with $\xi$ as the starting point
when $H_i$ is the true hypothesis, and let

$$E[n \mid X_\infty, y] = \xi E[n \mid X_\infty, y, H_1] + (1 - \xi) E[n \mid X_\infty, y, H_2].$$

Using Wald's well-known approximations, we find that

$$E[n \mid Y_\infty, y] - E[n \mid X_\infty, y]$$


| 


Tx(1:2) — Ix(2:1) | a ad 
= - a d a am © A —<$——— —  ___ 
I x(1:2)Ix(2:1) \é ‘ 7 74 — ] 


\ 


(4.3) 


cs me e+? 
+ (1 — é) |4 ~7- ua 
ef— 1 j 
Pre. 1 — a ane ft? - 1 
-_—— A(2t — ob Oe et Oe Cie he ee 
s¥(y) y + A(2 — 1) + 2A —— 2A | ——y + , eg 


$\psi(0) = \psi(A) = 0$ and, at least for $A \ge \log 3$, $\psi''(y)$ is positive and then negative
as $y$ increases from 0 to $A$. Hence, $\psi(y) \ge 0$ on $[0, A]$ if $\psi'(0)$ can be shown
to be positive. Now,


(4.4) y’(0)s4 + &4“(A — 2) + 2Ae* — 2Ae* — & “(A + 2). 
At $A = 0$, the right member of equation (4.4), together with its first three
derivatives with respect to $A$, is zero, while the fourth derivative is positive for
all $A \ge 0$. Therefore, $\psi'(0) > 0$ for $A > 0$, and it follows that

(4.5) $E[n \mid Y_\infty, y] - E[n \mid X_\infty, y] > 0$ for $y > 0$;

and by the evident symmetry,

(4.6) $E[n \mid Y_\infty, y] - E[n \mid X_\infty, y] < 0$ for $y < 0$.
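The sign pattern in (4.5) and (4.6) can be illustrated with a short computation. This is a sketch under stated assumptions, not the paper's own derivation: it uses the standard Wald approximations with overshoot neglected, boundaries at $\pm A$ with $A = \log\,(1 - r)/r$, and the illustrative values $p = 0.5$, $q = 0.2$, $r = 0.05$. The function names are hypothetical.

```python
import math

def kl_bernoulli(a, b):
    """Information number I(a:b) for one Bernoulli trial (assumed definition)."""
    return a * math.log(a / b) + (1 - a) * math.log((1 - a) / (1 - b))

def wald_expected_n(y, A, info_12, info_21):
    """Wald approximation (overshoot neglected) to the expected number of steps
    of a log-odds walk started at y with absorbing boundaries at -A and +A.
    info_12 is the mean step under H1; -info_21 is the mean step under H2."""
    xi = 1.0 / (1.0 + math.exp(-y))                    # prior probability of H1
    span = math.exp(A) - math.exp(-A)
    hit_up_h1 = (math.exp(A) - math.exp(-y)) / span    # P(reach +A | H1)
    hit_up_h2 = (math.exp(y) - math.exp(-A)) / span    # P(reach +A | H2)
    n_h1 = (A * (2 * hit_up_h1 - 1) - y) / info_12
    n_h2 = (A * (2 * hit_up_h2 - 1) - y) / (-info_21)
    return xi * n_h1 + (1 - xi) * n_h2

p, q, r = 0.5, 0.2, 0.05          # illustrative values only
A = math.log((1 - r) / r)
I12 = kl_bernoulli(p, q)          # I_X(1:2), which is also I_Y(2:1)
I21 = kl_bernoulli(q, p)          # I_X(2:1), which is also I_Y(1:2)

def difference(y):
    """E[n | Y always, y] - E[n | X always, y] under the approximation."""
    return wald_expected_n(y, A, I21, I12) - wald_expected_n(y, A, I12, I21)

for y in (-2.0, -1.0, 0.0, 1.0, 2.0):
    print(f"y = {y:+.1f}   E[n|Y] - E[n|X] = {difference(y):+.4f}")
```

With these values the difference is positive for $y > 0$, negative for $y < 0$, and vanishes at $y = 0$, in line with (4.5) and (4.6).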


Thus, the design which requires the use of $X$ on the $(j + 1)$th step if $y_j > 0$
and of $Y$ if $y_j < 0$ coincides with

(i) Choosing the random variable which has the largest expected movement
and that toward the nearest boundary, and

(ii) Choosing the random variable corresponding to the smaller of
$E[n \mid X_\infty, y_j]$ and $E[n \mid Y_\infty, y_j]$.

It is easily seen that it also coincides with

(iii) Choosing the random variable corresponding to the larger of $I_X(\xi_j)$ and
$I_Y(\xi_j)$.

Of course, in view of the fact that we have used the Wald approximations, the
description in (ii) should be understood as being only approximately valid.

Let the design thus characterized in four ways be denoted by $M$. This section
concludes with a general result which shows that $M$ is better than either $X_\infty$
or $Y_\infty$.

By a stationary design we shall mean a design in which the choice at the
$(j + 1)$th step is a function only of the a posteriori probability after the $j$th
step, $\xi_j$.

THEOREM 4.1. Let $X$ and $Y$ have densities $f_i$ and $g_i$, respectively, under hypothesis
$H_i$ such that both $\log f_2/f_1$ and $\log g_2/g_1$ assume positive and negative values with
positive probability. Let $D_1$ and $D_2$ be two stationary designs, and let $D$ be that design
which requires, at the $(j + 1)$th step, the random variable corresponding to the smaller of
$E[n \mid D_1, y_j]$ and $E[n \mid D_2, y_j]$. Then $E[n \mid D, y] \le \min\,\{E[n \mid D_1, y],\ E[n \mid D_2, y]\}$.

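PROOF. Let $\Gamma_i = \{y :$ for $y_j = y$, $D_i$ requires $X$ at the $(j + 1)$th step$\}$. With

$$T_X(\xi) = \frac{\xi f_1(X)}{\xi f_1(X) + (1 - \xi) f_2(X)} \quad \text{and} \quad T_X(y) = \log \frac{T_X(\xi)}{1 - T_X(\xi)},$$

(1) $$E[n \mid D_i, y] = \begin{cases} 1 + E_X[n \mid D_i, T_X(y)], & \text{if } y \in \Gamma_i, \\ 1 + E_Y[n \mid D_i, T_Y(y)], & \text{if } y \notin \Gamma_i, \\ 0, & \text{if } |y| > A. \end{cases}$$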


By some simple probability considerations it is easily seen that for any stationary
design $D'$, $E[n \mid D', y]$ is uniformly bounded in $y$.
Define the function $H$ and the set $\theta$ by

(2) $$H(y) = \min_i E[n \mid D_i, y] = \begin{cases} E[n \mid D_1, y] & \text{for } y \in \theta, \\ E[n \mid D_2, y] & \text{for } y \notin \theta, \\ 0 & \text{for } |y| > A. \end{cases}$$
Then with $\Gamma = (\Gamma_1 \cap \theta) \cup (\Gamma_2 - \theta)$ (which represents the region according
to the design $D$ where $D$ requires $X$), we have

(3) $$G(y) = H(y) - E[n \mid D, y] = \begin{cases} E_X[n \mid D_i', T_X(y)] - E_X[n \mid D, T_X(y)], & y \in \Gamma, \\ E_Y[n \mid D_j', T_Y(y)] - E_Y[n \mid D, T_Y(y)], & y \notin \Gamma, \\ 0, & |y| > A, \end{cases}$$

where $D_i'$ and $D_j'$ are the appropriate designs. It is also clear that $D_i'$ calls for a
continuation of $T_X(y)$ when $y$ is in $\Gamma$, and a similar result is valid for $D_j'$. From
(2) and (3), we obtain that

(4) $$G(y) \ge \begin{cases} E_X[G(T_X(y))], & y \in \Gamma, \\ E_Y[G(T_Y(y))], & y \notin \Gamma, \\ 0, & |y| > A. \end{cases}$$


It suffices to show $G(y) \ge 0$. Let $y_n$ denote a sequence of points where $|y_n| < A$
and such that $\inf_y G(y) = \lim_{n \to \infty} G(y_n)$. It follows from (4) that either a
subsequence $y_{n_i} + \log f_2(x)/f_1(x)$ exists such that

$$\inf_y G(y) = \lim_i G\Bigl(y_{n_i} + \log \frac{f_2(x)}{f_1(x)}\Bigr)$$

with probability 1, or a subsequence $y_{m_i}$ of $y_n$ exists such that

$$\inf_y G(y) = \lim_i G\Bigl(y_{m_i} + \log \frac{g_2(x)}{g_1(x)}\Bigr)$$

with probability 1. As both likelihood ratios are less than one with positive
probability, there exists a choice $\lambda$ ($\lambda > 0$) such that either $\lim_i G(y_{n_i} - \lambda) =
\inf G(y)$ or $\lim_i G(y_{m_i} - \lambda) = \inf G(y)$. Let us again denote the sequence which
fulfills this last requirement by $y_i^{(1)}$ so that $\lim_i G(y_i^{(1)}) = \inf G(y)$. Repeating
the argument a finite number of times, we arrive at a sequence $y_i' < -A$ and
$\lim_i G(y_i') = 0 = \inf_y G(y)$, or $G(y) \ge 0$.
TABLES OF EXPECTED VALUES OF ORDER STATISTICS AND
PRODUCTS OF ORDER STATISTICS FOR SAMPLES OF SIZE
TWENTY AND LESS FROM THE NORMAL DISTRIBUTION¹


EDITED BY D. TEICHROEW²
Numerical Analysis Research, University of California, Los Angeles
1. Summary. Tables of the means, variances, and covariances, to five decimal
places, of order statistics from samples of size ten or less have been given
by Godwin [3]. In this paper basic expected values are given of order statistics
and products of order statistics, for samples of size twenty and less to 10 decimal
places (DP).* In addition, certain other functions are tabulated to 25 DP
to facilitate extension to larger sample sizes.


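2. Introduction. Let $x_1, x_2, \cdots, x_N$ be a sample from $N(0, 1)$ arranged in
order of size so that

$$x_1 \le x_2 \le \cdots \le x_N.$$

The means, variances, and covariances of these "order statistics" may be obtained
from the following expressions for expected values and product moments:

$$E(x_i; N) = \frac{N!}{(i - 1)!\,(N - i)!}\, B(i - 1, N - i),$$

$$E(x_i^2; N) = \frac{N!}{(i - 1)!\,(N - i)!}\, D(i - 1, N - i),$$

$$E(x_i x_j; N) = \frac{N!}{(i - 1)!\,(j - i - 1)!\,(N - j)!}\, G(i - 1, N - j, j - i - 1),$$

where

$$B(m, n) = \int_{-\infty}^{\infty} x f(x) [F(x)]^m [1 - F(x)]^n \, dx,$$

$$D(m, n) = \int_{-\infty}^{\infty} x^2 f(x) [F(x)]^m [1 - F(x)]^n \, dx,$$

$$G(m, n, p) = \iint_{x < y} x y f(x) f(y) [F(x)]^m [1 - F(y)]^n [F(y) - F(x)]^p \, dx \, dy,$$

$$f(x) = \frac{e^{-x^2/2}}{\sqrt{2\pi}}, \qquad F(x) = \int_{-\infty}^{x} f(t) \, dt.$$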
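As an illustration of the first two expressions, the one-dimensional integrals can be evaluated by ordinary quadrature. The sketch below is not the method used for the tables: it assumes ascending order $x_1 \le \cdots \le x_N$, the integrands $x f F^m (1 - F)^n$ and $x^2 f F^m (1 - F)^n$ for $B$ and $D$, double-precision arithmetic, and a composite Simpson rule on $[-12, 12]$. All function names are hypothetical.

```python
import math

def f(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def F(x):
    """Standard normal distribution function; note 1 - F(x) = F(-x)."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))

def simpson(g, lo, hi, n):
    """Composite Simpson rule with n (even) subintervals."""
    h = (hi - lo) / n
    s = g(lo) + g(hi)
    for k in range(1, n):
        s += (4 if k % 2 else 2) * g(lo + k * h)
    return s * h / 3.0

def B(m, n):
    return simpson(lambda x: x * f(x) * F(x) ** m * F(-x) ** n, -12.0, 12.0, 4800)

def D(m, n):
    return simpson(lambda x: x * x * f(x) * F(x) ** m * F(-x) ** n, -12.0, 12.0, 4800)

def coefficient(i, N):
    return math.factorial(N) // (math.factorial(i - 1) * math.factorial(N - i))

def expected_value(i, N):
    """E(x_i; N) for the i-th smallest of N standard normal observations."""
    return coefficient(i, N) * B(i - 1, N - i)

def second_moment(i, N):
    """E(x_i^2; N)."""
    return coefficient(i, N) * D(i - 1, N - i)

print(expected_value(2, 2))                               # largest of two
print(expected_value(3, 3))                               # largest of three
print(sum(second_moment(i, 3) for i in range(1, 4)))      # should be close to 3
```

The first two printed values can be compared with the known closed forms $1/\sqrt{\pi}$ and $3/(2\sqrt{\pi})$; they agree in magnitude with the leading Table I entries .56418 95835 and .84628 43753.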
* The letters "DP" will be used for "decimal places." Values to 20 DP are available;
see the concluding paragraph of this article.
Interest in numerical values of order statistics from normal samples has arisen
from three main sources. The first is the need for "normalizing" data so that
techniques for the normal distribution can be used for the analysis. Fisher and
Yates [2] give the means of the order statistics for $N \le 50$ for this purpose. The
second is the search for a less laborious way to compute an estimate of the standard
deviation. The range, which is a function of order statistics, was suggested
for this purpose but, as is well known, it loses efficiency rapidly as the sample
size increases. Other functions of order statistics have been studied and it has
been shown by Godwin [4] that a linear function of the ordered observations can
be obtained which will for $N \le 10$ estimate the standard deviation with efficiency
at least as great as 0.9883. Cadwell [1] has examined the efficiency of
quasi ranges, $W_r$, where


$$W_r = x_{N-r} - x_{r+1},$$


and combinations of the form $W_r + \lambda W_s$, where $\lambda$ is an arbitrary constant.

The last and most recent interest in the order statistics has arisen from the 
study of censored samples. In this field it is natural to consider ordered observa- 
tions because it is usually the larger or smaller values which are missing. Papers 
in this field are by Gupta [5] and by Sarhan and Greenberg [10]. 

The practical usefulness of the proposed estimates is usually judged by the
efficiency which is obtained by comparing the variance of the estimate with the
variance of the optimum statistic. Most of the estimates are linear functions of
the order statistics and their variance can, therefore, be computed if $E(x_i; N)$,
$E(x_i^2; N)$, and $E(x_i x_j; N)$ are known.

The first attempt to compute these quantities was undertaken by Hastings,
Mosteller, Tukey, and Winsor [6]. They give for $N \le 10$ the variances to 5 DP
and the covariances to 2 DP. They mention the difficulty in computing these
values and state that more accurate values are required.

Jones [7] evaluated the variances and covariances for N < 4 in closed form.
Godwin [3] extended these results to N < 6 and computed by numerical integration
the variances and covariances for $N \le 10$ to 5 DP. The variances (and
higher moments) of the extreme were computed by Tippett [12] and have been
given by Ruben [9] to 9 DP for $N \le 50$.

In 1948, while computing efficiencies of various linear functions of order statistics,
Professor W. J. Dixon found that the number of decimal places given in
ref. [6] was not sufficient for practical purposes. He also came to the conclusion
that the tables should be extended beyond $N = 10$. He proposed the task of
computing the means, variances, and covariances for $N \le 20$ to the National
Bureau of Standards. Computation was started at its Institute for Numerical
Analysis in Los Angeles in 1949 under the sponsorship of the Office of Naval
Research, USN. This paper presents the report on that project.

In view of the large number of people who have contributed to this project, 
it seems desirable to present briefly the history of the computation and to give 
proper credit for the work. 
The original method of computation consisted in integration of $B$, $D$, and $G$,
as defined above, using 10-digit floating-point arithmetic and an integration
interval of 0.25. The results of this computation were not satisfactory, and in
1950, J. Barkley Rosser, who was then director of the Institute for Numerical
Analysis, studied the problem extensively and proposed the method of calculation
which was followed. This method is outlined in Section 3. Rosser, in a National
Bureau of Standards report, gave a method for computing moments of
order statistics and gave numerical values for the $E(x_i; N)$ and $E(x_i^2; N)$ for
$N$ up to 21 to 18 digits. He also gave constants required to compute the first 8
moments for $N$ up to 20. These computations were carried out to the full accuracy
(about 24 DP) obtainable from Sheppard's table of the normal probability
integral [11].

The problem of loss of accuracy in the combination of integrals was studied 
in detail by A. D. Hestenes and by differencing the Sheppard table it was found 
that the $\psi$'s (defined below) could not be computed accurately enough from that
table. It was, therefore, necessary to produce a more accurate table of the normal 
probability integral. Coding was started on a triple precision (108 binary digits) 
interpretative routine for the SWAC (Bureau of Standards Western Automatic 
Computer). This routine could handle addition, subtraction, multiplication, and 
integration; it had been coded but was not completely checked out by July 1, 
1954, when the Institute for Numerical Analysis ceased to exist as a section of 
the National Bureau of Standards and became the Numerical Analysis Research 
Project at the University of California, Los Angeles. Sponsorship by the Office 
of Naval Research continued. 

In addition to J. Barkley Rosser, a large number of people contributed to the
project during the period when it was administered by the National Bureau of
Standards. Among those who should be mentioned are the following: A. D.
Hestenes, who directed the project for a number of years; E. C. Yowell, who
supervised the computation and coding; G. Blanch, who directed some of the
checking; and M. Howard, S. Marks, O. Mock, and A. Rosenthal, who did the
computation and coding.

The author completed the problem and, of course, assumes full responsibility 
for all statements made about the accuracy of the results. The $E(x_i; N)$ and
$E(x_i^2; N)$ values were given by Rosser in his report and are published here with
his gracious permission. Rosser did not publish his paper because he felt it would 
be better to wait until the covariances were computed. He now expects to im- 
prove the theoretical parts of his paper before publishing it. 

This paper is to be regarded only as a report of the result of some lengthy 
computations. No attempt was made to use these results to answer statistical 
problems or to determine whether the computations should have been carried 
beyond $N = 20$. However, the $\psi(a, b)$ values are listed to permit the calculation
of some variances and covariances beyond $N = 20$. Some additional variances
can also be computed by the differencing method mentioned by Ruben [9]. In
view of the increasing difficulties of the computation as $N$ increases, it would
seem advisable to investigate techniques such as the one used by Cadwell [1]
before attempting further computations.


3. Computation and Checks. The means and variances are relatively simple 
to compute since they require merely the evaluation of one-dimensional integrals 
in which the integrands vanish at both ends of the range of integration. The values 
given in Tables I and II were computed by the method developed by Rosser 
[8]. Rosser’s method is applicable to higher moments; for the means and vari- 
ances, it is equivalent to the formulas given by Godwin [3]. The numerical in- 
tegration was performed on values of the normal integral obtained from Shep- 
pard’s table [11]. The computations were done in 1952 and 1953, using mainly 
punched-card equipment, under the direction of A. D. Hestenes. The values 
were checked by sum checks. In addition, the D’s appear in the computation of 
the G’s, and the B’s appear in the checks for the G’s. 

The computation of the covariances is more complicated because it involves
either the evaluation of a large number of double integrals $G(m, n, p)$ or the
more accurate evaluation of a smaller number of double integrals $\psi(a, b)$, which
requires more elaborate integration formulas. Godwin [3] has given a method for
the computation of the $B$'s, $D$'s, and $G$'s which is based on two basic sets of
integrals, $\psi(a)$ and $\psi(a, b)$, where

$$\psi(a) = \int_{-\infty}^{\infty} [F(x)]^a [1 - F(x)]^a \, dx,$$

$$\psi(a, b) = \int_{-\infty}^{\infty} [F(x)]^a \, dx \int_{x}^{\infty} [1 - F(y)]^b \, dy.$$

The method of computation used in this project is analogous to Godwin's; however,
it was developed independently by Rosser. All of the computations were
performed using fixed-point, triple-precision arithmetic on the SWAC.

The steps in the computation were as follows: 

(1) The normal density function $f(x)$ was computed for $x = -12.00(.02)0$.
At least 27 DP are accurate.

(2) The distribution function $F(x)$ was obtained by numerical integration,
using Everett's central difference formula (rewritten in terms of ordinates) with
terms up to and including $\delta^{22}$: i.e., 24 ordinates were used in each sum.

(3) $H(x; b) = \int_{-\infty}^{x} [F(t)]^b \, dt$, for $b = 1(1)19$, was obtained by the same
numerical integration routine as that used in (2).

(4) $\psi(a, b) = \int_{-\infty}^{\infty} [F(t)]^a H(-t, b) \, dt$, for $a = 1(1)19$, was obtained by
multiplying ordinates and summing.

(5) $$G(m, n, 0) = \frac{D(n + 1, m)}{2(n + 1)} + \frac{D(n, m + 1)}{2(m + 1)} - \frac{\psi(n + 1, m + 1)}{(n + 1)(m + 1)}.$$

(6) $G(m, n, p + 1) = G(m, n, p) - G(m + 1, n, p) - G(m, n + 1, p).$

(7) $$E(x_i x_j; N) = \frac{N!}{(i - 1)!\,(j - i - 1)!\,(N - j)!}\, G(i - 1, N - j, j - i - 1).$$
The formulas in (5), (6), and (7) are equivalent to Godwin's [3] formula 4,
page 281. His formula is useful for computing an isolated $E(x_i x_j)$; ours are
useful if all the $E(x_i x_j)$ are needed. It may be noted that for $N \le 20$, $\psi(a, b)$
is required only for $a + b \le 19$; the extra values were computed for checking.

The following checks were used:

(1) The normal density function checked, at intervals of 0.1, with the values
given by Sheppard [11]. This, together with the fact that $\psi(1, 1)$ turned out to
be $\tfrac12$, to 27 DP, leads us to believe that $f(x)$ is correct to at least 27 DP.

(2) The above remarks apply also to $F(x)$; in addition, $F(0) = \tfrac12$, to 27 DP.

(3) Our faith in $H(x; b)$ depends on the accuracy of $\psi(a, b)$.

(4) The $\psi$'s must satisfy the formula

$$\psi(a)\psi(b) = 2 \sum_{u=0}^{a} \sum_{v=0}^{b} (-)^{u+v} \binom{a}{u} \binom{b}{v} \psi(a + u, b + v).$$

As mentioned above, the $\psi(a)$'s used for the left-hand side were computed on
IBM equipment using values from Sheppard's tables and are therefore completely
independent of the $\psi(a, b)$. In the computation of the right-hand side, individual
terms could be computed to only 26 DP. The identity was satisfied to within 1
unit in the 25th DP. We can therefore conclude that the $\psi(a)$'s and $\psi(a, b)$ are
correct to 25 DP for $a \le 9$; $a, b \le 18$. (Unfortunately, $\psi(a, b)$ for either $a$ or $b$
$= 19$ cannot be checked until the values for $a$ or $b = 20$ are also available.) An
additional check is that obtained from the symmetry relation $\psi(a, b) = \psi(b, a)$.
This check was satisfied to more than 25 DP.

(5) Once the $\psi(a, b)$'s have been checked, the problem becomes one of applying
the recurrence relations correctly. This part of the computation can be checked
by the formula

$$G(0, 0, 2p) = \sum_{i=0}^{p-1} (-)^i \binom{2p}{i} B(2p - i, 0) B(i, 0) + \tfrac12 (-)^p \binom{2p}{p} [B(p, 0)]^2.$$

This identity was satisfied to 25 DP. It may be noted that $G(0, 0, 18)$ is a function
of all $G(m, n, 0)$ and also of all $G(m, n, p)$ for $p \le 17$, and the correctness of
$G(0, 0, 18)$ verifies the correctness of all $G(m, n, p)$ except possibly for mistakes
in punching. This source of error was eliminated in the next check.

(6) The final values are checked by the identity

$$\sum_{i=1}^{N} E(x_i x_j; N) = 1.$$

The sum includes the variances, which satisfy the identity

$$\sum_{i=1}^{N} E(x_i^2; N) = N.$$

These two relations are satisfied to within one or two units in the 20th DP.

(7) In addition, for N < 6, the closed-form expressions as given by Jones [7]
and Godwin [3] were computed to 32 DP and used as checks.
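Steps (1) to (4) and the checks on the $\psi$'s can be imitated on a small scale. The following is only a sketch in ordinary double precision, not the triple-precision SWAC routine: it assumes $H(x; b) = \int_{-\infty}^{x} [F(t)]^b \, dt$, uses the grid $x = -12.00(.02)12.00$, and replaces Everett's formula with a plain cumulative trapezoidal rule, so only about four decimal places should be expected. The names `psi1` and `psi2` are hypothetical.

```python
import math

H_STEP = 0.02
K = 600                                        # grid runs from -12.00 to +12.00
xs = [(k - K) * H_STEP for k in range(2 * K + 1)]
F = [0.5 * math.erfc(-x / math.sqrt(2.0)) for x in xs]   # F(x) on the grid
F_neg = F[::-1]                                # F(-x) = 1 - F(x) by symmetry

def cumulative(values):
    """Cumulative trapezoidal integral from the left end of the grid."""
    out = [0.0]
    for k in range(1, len(values)):
        out.append(out[-1] + 0.5 * H_STEP * (values[k - 1] + values[k]))
    return out

def trapezoid(values):
    """Trapezoidal integral over the whole grid."""
    return H_STEP * (sum(values) - 0.5 * (values[0] + values[-1]))

def psi1(a):
    """psi(a): integral of [F(x)]^a [1 - F(x)]^a."""
    return trapezoid([(F[k] * F_neg[k]) ** a for k in range(len(F))])

def psi2(a, b):
    """psi(a, b): integral of [F(t)]^a H(-t; b), multiplying ordinates and summing."""
    H = cumulative([Fk ** b for Fk in F])      # H(x; b) on the grid
    H_mirror = H[::-1]                         # H(-t; b)
    return trapezoid([F[k] ** a * H_mirror[k] for k in range(len(F))])

lhs = psi1(1) ** 2
rhs = 2.0 * (psi2(1, 1) - psi2(1, 2) - psi2(2, 1) + psi2(2, 2))
print("psi(1, 1)          =", psi2(1, 1))      # check (1): should be near 1/2
print("psi(1, 2), psi(2, 1) =", psi2(1, 2), psi2(2, 1))
print("identity (a = b = 1):", lhs, rhs)
```

The case $a = b = 1$ of the product identity in check (4) reads $\psi(1)^2 = 2[\psi(1,1) - \psi(1,2) - \psi(2,1) + \psi(2,2)]$, which is what the last line compares.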
4. Tables. Table I gives $E(x_i; N)$, and Table II gives $E(x_i x_j; N)$, rounded
to 10 DP. Missing values may be obtained by symmetry relations, e.g.,

(i) $E(x_i; N) = -E(x_{N-i+1}; N)$

(ii) If $N$ is odd, the center order statistic has its mean equal to zero

(iii) $E(x_i x_j; N) = E(x_j x_i; N)$

(iv) $E(x_i x_j; N) = E(x_{N-i+1} x_{N-j+1}; N)$.

Table III gives $\psi(a)$ and Table IV gives $\psi(a, b)$ rounded to 25 DP. Again, missing
values can be obtained by symmetry


(v) $\psi(a, b) = \psi(b, a)$.
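A small worked check shows how symmetry relations (iii) and (iv) fill in a block of Table II, and how the sum identities of check (6) then apply. The four entries used below are read from the opening rows of Table II and are assumed here to be the $N = 3$ block, since the $N$ column did not survive in those rows; the helper name is hypothetical.

```python
from decimal import Decimal

N = 3
# Tabulated entries E(x_i x_j; 3) with i <= j, to 10 DP (assumed N = 3 block).
tabulated = {
    (1, 1): Decimal("1.2756644477"),
    (1, 2): Decimal("0.2756644477"),
    (1, 3): Decimal("-0.5513288954"),
    (2, 2): Decimal("0.4486711046"),
}

def product_moment(i, j):
    """E(x_i x_j; N), filling gaps with symmetry relations (iii) and (iv)."""
    if i > j:
        i, j = j, i                        # relation (iii)
    if (i, j) not in tabulated:
        i, j = N + 1 - j, N + 1 - i        # relation (iv), kept with i <= j
    return tabulated[(i, j)]

for j in range(1, N + 1):
    column_sum = sum(product_moment(i, j) for i in range(1, N + 1))
    print(f"sum over i of E(x_i x_{j}; 3) = {column_sum}")

trace = sum(product_moment(i, i) for i in range(1, N + 1))
print("sum of E(x_i^2; 3) =", trace)
```

Each column sum comes out at 1 and the sum of the second moments at 3, as the identities require.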


The values given in Tables I and II were computed to 20 DP; they are given in this paper
to 10 DP mainly because this should be sufficient for most purposes. The complete table
can be obtained on 925 punched cards by writing the Director, Numerical Analysis Research,
University of California, Los Angeles 24, Calif., requesting table number 5024. A
charge of three new cards for each one used will be made. A typed version of the tables can
be borrowed from UMT of MTAC by writing the Editor, Mathematical Tables and Other
Aids to Computation, University of California, Los Angeles 24, Calif. A similar arrangement
holds for table number 5023 which gives on 1180 punched cards the values for $f(x)$ and $F(x)$
for $x = -12.00(.02)12.00$ to 27 DP. Tables of $[F(x)]^k$ and $H(x; k)$ for $k = 1(1)19$ and
$x = -12.00(.02)12.00$ are also available; correspondence regarding these should be sent directly
to the author.
TABLE I
Expected values of order statistics


$N$ | $i$ | $E(x_i; N)$ || $N$ | $i$ | $E(x_i; N)$ || $N$ | $i$ | $E(x_i; N)$


.56418 95835 ‘ f .31224 88787 |} 17 5 | .61945 76511 
.84628 43753 ; .10258 96798 | 17 | 6 .45133 34467 
.02937 53730 3 .66799 01770 | 17 .29518 64872 
29701 13823 3 ‘ 16407 71937 || 17 14598 74231 
16296 44736 3 3 84983 46324 18 $2003 18790 
19501 89705 3 { 60285 OO882 1S ; 35041 37134 
. 26720 63606 3 } 38832 71210 18 ‘ 06572 81829 
64175 50388 3 19052 36911 18 é .84812 50190 
.20154 68338 70338 15541 || 18 f 66479 46127 
83756 d 20790 22754 | 18 5 50158 15510 
42706 3 90112 67039 18 35083 72382 
35270 69592 d 66176 37035 18 .20773 53071 
42360 03060 é f 45556 60500 18 .06880 25682 
85222 48625 26729 70489 19 84448 15116 
47282 24949 08815 92141 ; .87993 84915 
.15251 43995 .73591 34449 3 .09945 30994 
.48501 31622 | lL : .24793 50823 88586 19615 
93229 74567 f ’ .94768 90303 : .70661 14847 
.57197 07829 71487 73983 5 .54770 73710 
.27452 59191 f E 51570 10430 40164 22742 
.53875 27308 f j 33529 60639 | { . 26374 28909 
.00135 70446 : 16529 85263 ¢ .13072 48795 
.65605 91057 ) .76599 13931 86747 50598 
37576 46970 5 j .28474 42232 .40760 40959 
12266 77523 5 : .99027 10960 ; .13094 80522 
58643 63519 d 76316 67458 d 92098 17004 
06191 65201 } f 57000 93557 E .74538 30058 
72883 94047 ) 39622 2755 ) .59029 69215 
16197 83072 23375 15785 | .44833 17532 
22489 08792 } .07728 74593 8 | .31493 32416 
62922 76399 .79394 19809 .18695 73647 
11573 21843 2 31878 19878 .06199 62865 
79283 81991 3 .02946 09889 
53684 30214 d .80738 49287 
TABLE II
Expected values of products of normal order statistics

$N$ | $i$ | $j$ | $E(x_i x_j; N)$ || $N$ | $i$ | $j$ | $E(x_i x_j; N)$ || $N$ | $i$ | $j$ | $E(x_i x_j; N)$

1} 1.00000 00000 | 8} 1] 1| 2.39053 49747 A 1} 6} —.13035 66254 
1) 1.00000 00000 | 8) 1| 2) 1.39053 49747 | 10) 1) 7 | —.52928 83257 
2] 1.00000 00000 | 8/1] 3 .79907 62783 10) 1| 8| —.96842 82811 
1| [1.27566 44477 | 8/1] 4 .31184 25735 lio |1| 9 | —1.50680 02399 
2| .27566 44477 | 8/1) 5| —.14235 45216 | 1| 10 | —2.34106 10315 
3| —.55132 88954 | 8} 1] 6 | —.61290 27315 ry 2} 2] 1.21724 00737 
2 44867 11046 | 8) 1| 7 | —1.16492 90242 |10| 2] 3 .80357 20248 
1 | 1.55132 88954 | 8/1) 8 | —1.98980 25239 |10|2| 4 .48797 62226 
2| 55132 88054 | 8|2| 2| 96568 82621 |10|2| 5| 21257 70424 
2 | —.14772 81323 | 8| 2] 3 .56614 69584 10/2) 6 | —.04863 46765 
4| —.95492 96586 | 8/2) 4 25323 98948 |10| 2| 7| —.31404 67778 
2 44867 11046 | 8|2| 5 | —.03241 18438 |10/2| 8) —.60464 26847 
3 14772 81323 | 8|2| 6| —.32422 86175 |10!2| 9| —.95934 47746 
1| 1.90002 04360 | 8,2) 7) —.66304 06044 | 10/3) 3 .60541 68331 
2 -80002 04360 | 8|3)| 3 42432 99017 | 10/3 | 4 .38032 60958 
3 .14814 77252 | 8|3| 4 .22447 06701 |10) 3 | 5 .18822 18294 
4| —.46991 74988 | 8/3] 5 04885 15166 10/3 | 6| — .00874 81054 
5 | —1.27827 10984 | 8|3| 6 | —.12574 39762 }10|3| 7) —.17160 54566 
2 .55656 27332 | 8|4) 4| .21044 68615 |10|3| 8 | —.36738 03049 
3 .20843 54440 | 8/4/ 5| .12591 48488 | 10) 4)| 4 29913 80219 
4} —.00510 11144 | 9/1] 1) 2.56261 74183 |10| 4) 5 .17360 31403 
3 .28683 36616 | 9) 1| 2| 1.56261 74183 | 10|4| 6| .05969 16062 
1 | 2.02173 90604 | 9} 1| 3 97012 95851 }10| 4 | 7 | —.05225 20049 
2| 1.02173 90694 | 9) 1] 4 49898 17432 |10/5| 5 .16610 12814 
3 -39483 66863 | 9/1) 5|  .07274 22354 | 10, 5| 6 .11055 15903 
4| —.15297 20858 | 9/1) 6 | —.34819 14908 }11|1| 1| 2.85002 77414 
5 | —.73587 22832 | 9| 1) 7| —.80030 77349 |11/1| 2| 1.85002 77414 
6 | —1.54947 05060 | 9| 1) 8 | —1.34438 03016 |11|1| 3| 1.26861 57614 
2 69142 72690 | 9) 1| 9 | —2.17420 88731 | ll} 1| 4 .81841 62399 
3|  .31832 96521 | 9|2| 2| 1.09487 54256 }11) 1) 5 42562 33724 
4 .01032 03642 | 9/2) 3 .68736 32588 | 11} 1) 6 .05720 07586 
5| —.30504 40716 | 9/2) 4| .37294 55079 | 11/1] 7/| —.30839 96597 
3|  .28683 36616 | 9|2| 5 09344 77394 | 11; 1| 8| —.69165 68331 
4 .14265 16716 | 9| 2| 6 | —.17939 36730 }11| 1| 9 | —1.12114 69906 
1 | 2.22030 41356 | 9| 2) 7| —.47001 14367 ||11| 1 | 10 | —1.65524 31199 
2| 1.22030 41356 | 9} 2| 8| —.81746 39387 |11| 1| 11 | —2.49346 50118 
3 60903 83042 | 9} 3| 3|  .51353 31898 }11| 2| 2| 1.33286 42755 
4 .09848 68607 | 9/3) 4) ‘000 areas |i 2| 3] .91427 62554 
5 —.40036 28885 | 9/3) 5|  .11376 80176 }11} 2) 4] — .50773 16558 
6 | —.96418 63986 | 9/3| 6| —.06365 82663 |11|2| 5 2525 83655 
7 | —1.78358 41490 | 9| 3) 7 | —.24901 53959 | 11/2] 6 07192 05024 
2 .83034 86720 | 9/4) 4 .24592 33257 | 11| 2] 7 .17792 83736 
3 44161 45034 | 9/4) 5 .13699 13669 |} 11| 2] 8| —.43863 19457 
4 13072 98656 | 9/4) 6 .03730 27039 | 11|2| 9] —.72971 16588 
5 | —.16517 61670 | 9} 5| 5|  .16610 12814 | 11| 2] 10 | —1.09056 36980 
6 —.49363 46110 }10|/1| 1| 2.71210 37899 | 11| 3] 3 .69693 11658 
3 34412 37617 | 10) 1 | 2| 1.71210 37899 |11| 3] 4 .46367 52869 
4 16555 98429 |10|1| 3| 1.12577 18388 |11| 3] 5 . 26655 00636 
5 .00520 26434 | 10} 1 | 4| .66645 83784 |11/ 3] 6 .08551 78832 
4 21044 68615 | 10) 1 | 5| — .25949 67065 | 11| 3] 7] —.09143 52295 
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$7845 
.36137 
. 22376 
. 100038 
.01901 
14087 
19021 
11674 
04861 
13716 
2 97801 
1.97801 
1.40064 
95770 
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11947 
— .46770 
83919 
1.26121 
1.79198 
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1.44212 
1.01949 
70216 
43189 
18424 
05500 
— .29717 
— .55469 
— .84647 
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78657 
54683 
34582 
16355 
— .01121 
— .18712 
— .37334 
— .58355 
.42801 
28119 
. 15023 
.02616 
— .09754 
22753 
. 22811 
.14165 
06166 
— .01660 


06666 
18709 
86128 
99938 
{6585 
SIS895 
$8129 
69879 
$9805 
T6885 
24335 
QOS96 
Q0896 
18721 
81654 
29501 
53113 
98445 
36483 
55827 
26489 
71742 
05793 
29110 
71285 
89575 
07057 
85733 
35423 
48037 
83245 
59479 
75731 
10977 
5975A 
33989 
98631 
56206 
13766 
70932 
66485 
13701 
74136 
90816 
74273 
90483 
$3422 
30981 
47372 
16395 
20662 
TABLE II—Continued
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.13716 
09786 
3.09739 
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52340 
O64 1 
71318 
37261 
O468SS8 
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.61230 
— .97461 
39066 
91878 
76375 
54548 
11947 
80149 
53292 
28959 
O5804 
17146 
40813 
66340 
.95595 
32667 
87361 
62859 
$2447 
24120 
06792 
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- 27856 
.46735 
.68315 
.49643 
. 34235 
. 20584 
07801 
.04713 
.17494 
.31169 
. 27404 


a= 


fila 


08904 
.00336 
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.15461 
.10168 
-05212 
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99407 
66149 
66149 
67031 
29990 
92732 
38249 
33088 
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31771 
57510 
20939 
35177 
64087 
87851 
86969 
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13392 
SS669 
57285 
74891 
32366 
38591 
83375 
39000 
06037 
26962 
04964 
59303 
82354 
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77916 
91289 
5225 
94108 
43058 
27845 
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.51105 


03701 


2 SS84188 
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21455 
89599 
62879 
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51204 
76565 
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. 14535 
~ .02200 


19040 
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.77769 
56515 
-40517 
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. 13349 
.00719 
.11927 


. 25063 
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. 32464 
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12521 
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85490 
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S6980 
46103 
16865 
56899 
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65052 
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&3466 
95664 
L5A75 
73868 
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69979 
55279 
95612 
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15141 
39030 
$6326 
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40043 
11045 
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07284 
58138 
77133 
71294 
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59995 
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.11970 
.06039 
.00245 
.11679 
08753 
31443 
31443 


74582 


. 31802 


.95779 
.63469 
33225 


.03957 


- .25204 


.55108 


.86769 
21655 
62362 
14777 
99828 
73646 
30507 
98602 
71995 

48276 


. 26168 


.04842 
16355 
38050 
60984 


- .86220 


. 15632 
.53462 
.03884 
78569 
. 57688 
. 39208 
. 22070 
.05601 
. 10720 
. 27386 
.44968 
.64283 
86760 
.63328 
.46839 
. 32386 
.19076 
.06351 
TABLE II—Continued
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89950 
66786 
70586 
70586 
84742 
46931 
69592 
79582 
18229 
36673 
03438 
35331 
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26558 
11465 
39615 
17812 
34988 
20833 


72992 


26573 | 


68106 
92701 
38833 
24188 
99695 
62515 
53501 
39995 
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67484 
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47969 
32453 
74184 
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.06206 


— . 18987 


92433 


47169 


37781 
. 26743 
. 16683 


07143 


- .02212 


11683 
21603 
21829 
14689 
08014 


.01543 


- 04944 


.13001 
.09004 


.05235 


.10169 


3.41373 
2.41373 


84731 


42314 


-06794 
.75138 


45735 


.17550 


.10195 

38202 
.67219 
.98198 


32575 
.72935 
. 25197 


3.10489 


.82496 
.39138 
.07191 
.806577 
.57185 
.35451 
. 14679 
.05723 
. 26280 
ATHAS 
.70227 
.95365 
24851 
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91293 
67942 
82747 
95126 
74644 
31303 
37346 
31681 
21836 
30234 
47915 
OO871 
22656 
07559 
42647 
10103 
52951 
99964 
02295 
46521 
54094 
54094 
12087 


99069 | 


02886 
84716 


36462 | 


84595 
11721 
22664 
06141 
37155 
02291 
17286 


62116 | 


68626 
17979 
59986 
00598 
53414 
42866 
43842 
55938 
07939 
91651 
74045 
00971 
51101 
19137 
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~ 1.62998 

1.11697 
. 86061 
64998 


- .§2983 
72482 
- .95327 
70028 
53126 
38373 
24869 
. 12065 
00432 
12962 
25872 
39590 
- .5A749 
.43226 
31667 
.21178 
11300 
.01708 
- .07867 
17698 
28112 
25803 
. 18008 
10744 
03753 
- .03176 
- .10247 
.15204 
10368 
05793 
01325 
10169 
07905 
3.50776 
50776 
-94326 


FOO 


17139 
86039 
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91758 
HADAS 
26454 
26930 


O4477 
89274 
94941 
26425 
60638 
70563 
73058 
92373 
35264 
20754 
15226 
61178 
37162 
66569 
32978 
09182 
81070 
23745 
39395 
86399 
34551 
18433 
66926 
22446 
23697 
86623 
04100 
70169 
15039 
42139 
3993 
24618 
42572 
54921 
33917 
16521 
57704 
08345 
08345 
81594 
66799 
83159 
87122 
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.57337 
. 30036 
.03414 
.23135 
.50210 


~ .78494 


.08900 
42843 
. 82904 
35036 


3.20550 


.90932 
.47382 
. 15396 
. 88962 
.65661 
.44236 
. 23913 
.04139 
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.56521 
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. 33608 
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.53491 
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.10781 
. 26569 
.43021 
.60670 
80335 
.03505 
76587 
59330 
44322 
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02504 
41055 
40122 
O8789 
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90086 
26533 
55966 
31645 
13557 
86076 
12827 
29569 
66310 
02659 
27797 
96759 
28192 
06731 

25332 
21375 
25546 
60200 
86697 
01008 
07252 
05306 
96577 
52167 
94144 
91772 
55487 
56686 
04144 
07197 

27090 
02164 

57182 
72428 
28054 
64848 
90118 
06544 
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.07014 
. 19540 
.32570 
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29393 
TABLE II—Continued
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.48712 
36715 
25870 
.15715 
.05931 
.03730 
.13505 
. 23648 
. 34489 
30058 
. 21720 
. 13980 
.06574 
00698 
08021 
15588 
18003 
12491 
07281 
02217 
02837 
.11204 
.O8080 
05114 
09004 

3.59704 

2.59704 
.03427 
.61609 
26897 
96275 
68166 
41601 
. 15896 
.09496 
35079 
.61383 

- 89051 
18969 
.52536 
.92337 
44355 

3.30074 
.98991 
.55267 
23247 
.96883 
73744 
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23190 | 
82192 | 
49498 | 


89814 
34507 
87706 
72491 
67988 
48775 
02158 
43470 
87085 
20328 
42736 
59665 
38072 
96202 
82402 
29753 
54074 
32128 
24403 
84927 
00267 
76686 
65814 
61702 
61702 
65308 
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51889 
09008 
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42554 
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71599 
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37283 
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81181 
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35893 
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69926 
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20810 
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47854 
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.17553 00021 | 19 
-09809 30677 || 19 
.02260 96299 | 19 
.05260 93423 \ 19 | 
.12923 63965 19 | 
.20919 60677 19 
.21210 34702 | 19 
.15139 92491 19 
.09415 84056 | 19 


.73123 82925 119|} 6) 1 .02134 96632 
.95274 96978 511 .09877 04413 
20223 32941 5 1 1 .17836 20745 
49908 24322 .26205 43418 
88835 35110 | 24693 38145 
.33451 10082 .18154 56358 
07070 03889 .12004 79540 
85591 60662 .06082 97030 
66910 59363 .00259 85845 
.03878 84044 | 19 49962 04556 - .05582 13320 
01603 11780 || 19 34112 70806 11565 43329 
- .07156 68585 1193] ¢ 18934 87555 .15239 43086 
12965 00217 | 19 .04103 65629 .10850 51122 
09146 89512 | 19 10659 12312 06669 58229 
.05509 64106 | 19 - 25623 38217 .02595 95147 
.01955 77328 || 19 41086 52514 01458 51042 
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12082 70910 || 19 89222 54987 3.76315 97146 
.70514 62523 | 19 71386 54325 |: ; 2.76315 97146 
36134 13030 || 19 55969 65138 | ; 2.20334 06878 
05927 66874 || 19 .42044 02316 | .78989 83552 
78329 22736 | 19 29064 30231 | 44904 08118 
52386 95189 | 19 .16666 14590 |; 5 .15063 48881 
27445 15628 | 19 04575 76598 87909 21502 
02996 34144 | 19 .07438 76891 .62502 36800 
21401 84895 | 19 .19600 29194 .38206 78460 
.46185 59408 | 19 32152 30588 | .14543 27711 
71841 79950 || 19 .45396 68711 — .O8889 26380 
- 98983 53826 | 19 .59756 57833 |: ‘ 32465 42591 
.28478 86665 | 19 .75905 79000 ‘ 56576 49939 
61718 28951 | 19 59609 42648 81678 
01289 86080 19 46912 18881 | —1.08365 51006 
53209 20354 | 19 .35508 47179 j 37491 30326 
3.39117 37939 | 19 24925 07652 .70441 51707 
06701 59055 | 19 .14849 89037 | 8 2.09809 45742 
62823 66660 | 19 05051 41639 | 20 | 1 | 1 61641 25315 
30772 81241 | 19 .04663 86868 ’ 3.47725 83786 
04467 74622 | 19 .14479 56323 ||: 2 | .14092 24544 
81469 35030 | 19 .24594 14349 |: : .70074 14812 
60527 67432 | 19 35251 67925 || .37995 34182 
40891 41917 || 19 46792 44975 |; § .11742 89274 
22048 14684 | 19 39000 52335 ; 5 .88867 43301 
.03604 90040 | 19 29818 53404 ‘si 7 .68118 45644 
14777 85386 | 19 | 6 3 21348 33614 |/20/2)| 8 48750 54400 
- .33432 31434 |19/ 6] § 13323 26697 | 20 | : ¢ 30263 12966 
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.05502 
. 23379 


11646 


- 60652 


80845 
02871 
27777 
57521 
96663 
40185 
13608 
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.73304 


56384 
10630 
25621 
11046 
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17809 


32570 


.47916 


64209 
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95288 


77212 
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56801 
34563 
98104 
56628 
17036 
53688 
E(x_i x_j; N)


.61628 48185 
$7597 85179 
34573 32605 
. 22193 82389 
10194 2O857 
01641 62785 
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925616 84132 
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29597 10292 
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3560 15809 
.56418 95835 47756 28694 80786
.09900 37940 91548 44183 91722 
.02015 46833 80170 42508 13857 
.00435 75542 71626 37697 33560 
.00097 35535 89558 23860 49576 
.00022 20555 84024 11010 03862 
.00005 13748 33363 61937 65236
.00001 20104 83187 26023 22116
.00000 28302 16707 39166 75167 
.00000 06711 15270 89017 24362 
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.09058 
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- .06569 


. 14506 


- 297296 
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.13013 
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06608 


63337 
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00797 
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96896 
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07728 
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50395 
Table of w (a, b)


.50000 
.19550 
.11216 

07565 
.05580 

04357 

03538 

02956 
.02525 

02194 
.01933 
.01723 
.01550 
.01406
.01285
.01181
.01091 
.01013 
.00944 
.05015 
.02132 
.01138
.00693 
.00460 
.00324 
.00239 
.00182 
.00143 
00115 
.00094 
.00078 
.00066 
.00056 
.00048 
.00042 
.00037 
.00032 
.00720 
.00319 
.00165 
.00095 
.00059 
.00039 
.00027 
.00019 
.00014 

00011 


00000 
11094 
77761 
$2297 
73499 
46097 
45804 
98892 
68141 
71151 
76592 
43142 
74682 
75480 
08246 
08168 
28876 
07382 
{0684 
71620 
62753 
67208 
43673 
22186 
55846 
45873 
94811 
71193 
47563 
54804 
65051 
31883 
57969 
76700 
41332 
18327 
83152 
90995 
21562 
92448 
98747 
95944 
68460 
47966 
73157 
59658 
07018 


00000 
77885 
$4551 
25579 
43013 
63369 
94334 
89549 
86726 
27269 
35855 
41634 
05471 
43521 
58215 
84164 
60934 
94693 
$7665 
45802 
79221 
07328 
96898 
91894 
02798 
85965 
31165 
77785 
6723 
45993 
56560 
39184 
93061 
79783 
83094 
14978 
02436 
48137 
49199 
05188 
31303 
74933 
29506 
72790 
57472 
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00000 
32095 
98762 
61175 
90255 
19083 
80517 
34717 
$1207 
20600 
66560 
87962 
55219 
95895 
88637 
58441 
70642 
15624 
48610 
97767 
91857 
55199 
01837 
56707 
21587 
23952 
13734 
89338 
94265 
23095 
O9801 
56285 
89389 
07253 
95640 
89632 
92123 
29975 
39221 
72636 
13651 
32285 
83813 
92694 
13119 
21653 


7245 


O0000 
55017 
21684 
38235 
21453 
54940 
52981 

07091 

16719 
60009 
$1992 
30710 
43757 
85675 
11122 
63317 
54955 
57146 
45567 
98872 
50860 
98228 
19221 
40120 
66344 
58542 
14246 
99162 
26607 
16447 
74959 


61034 
01059 
18455 
43507 
21206 
65961 
17817 
06090 
02138 
18810 
31343 
00407 
65464 
05727 
59531 
.00008 57494 20745 29354 41430
.00006 76374 31474 50937 57570 
.00005 41981 35043 80420 32791 
.00004 40328 83934 46386 89237 
.00003 62132 28079 75117 99695 
.00003 01072 80833 83210 31975 
.00002 52753 96848 34490 95730 
.00120 76001 82167 82352 26105 
.00054 77852 11336 04939 03183 
.00028 10310 06320 11589 21149 
.00015 76600 19741 38803 68994 
.00009 46764 05000 10155 78414 
.00005 99853 69007 17101 90664 
.00003 96902 11683 28478 11889 
.00002 72194 71775 97915 22185 
.00001 92375 16174 56717 72636 
.00001 39495 48566 95416 17691
.00001 03414 61725 18665 97688 
.00000 78159 05356 05535 22392 
.00000 60081 58453 34511 61894 
.00000 46884 59596 75746 80849 
.00000 37080 29284 86122 61652 
.00000 29681 47498 09241 60418 
.00022 04352 81924 64868 68374 
.00010 16119 70163 35119 22870 
.00005 17462 48306 80576 76418 
.00002 84461 98484 02076 92270 
.00001 66157 36979 80989 52093
.00001 01966 85130 01531 28746 
.00000 65193 63290 35670 04499 
.00000 43150 18311 90904 55499 
.00000 29418 67127 84332 51278 
.00000 20577 79587 34297 33162 
.00000 14720 09436 42762 11034 
.00000 10740 09945 28233 99675 
.00000 07975 08312 69926 99696 
.00000 06015 70599 45933 50524 
.00000 04602 34345 95088 33995 
.00004 25243 42919 47828 83531 
.00001 98283 69359 42453 27161 
.00001 00513 90400 16675 62086 
.00000 54466 38456 73171 00400 
.00000 31169 02107 77339 59195 
.00000 18667 24719 96959 55595 
.00000 11619 46631 41218 49052 
.00000 07476 11073 65175 47454 
.00000 04950 58796 83923 47338 
.00000 03361 94671 97672 93207 
.00000 02334 58274 02965 95783 
.00000 01653 68708 77040 26861
.00000 01192 42176 55398 09875 
.00000 00873 73236 45446 11208 
.00000 85263 15970 45567 62570 
.00000 40100 97465 04941 91140 
.00000 20265 99629 82427 26539 
.00000 10865 07155 13142 50967 
.00000 06120 37985 00067 05510 
.00000 03595 78188 47635 74603 
.00000 02190 51850 55752 58827 
.00000 01377 23312 93196 02647 
.00000 00890 26140 99352 28361 
.00000 00589 80231 05959 92095 
.00000 00399 41744 31029 13351 
.00000 00275 87288 77426 51883 
.00000 00193 96542 56110 57069 
.00000 17590 42868 28919 40610 
.00000 08328 88124 97958 28581 
.00000 04200 22631 65991 36053 
.00000 02233 38303 77997 47451 
.00000 01242 45102 77378 15424
.00000 00718 70359 09459 99756 
.00000 00430 15369 82998 01350 
.00000 00265 30223 60700 01871 
.00000 00168 05142 13726 33829
.00000 00109 01922 50194 42794
.00000 00072 25796 62989 05235 
.00000 00048 83175 48637 35200 
.00000 03709 61389 15222 86735 
.00000 01765 94102 88575 28247 
.00000 00889 17735 02404 15631
.00000 00469 72562 05520 73927 
.00000 00258 67365 43949 17592 
.00000 00147 72632 94183 79478 
.00000 00087 11889 78661 48239 
.00000 00052 86627 59071 61725 
.00000 00032 91278 73157 81782
.00000 00020 96874 99904 65848 
.00000 00013 64148 37299 24226 
.00000 00796 01729 04764 44595 
.00000 00380 64032 71975 08928 
.00000 00191 43513 61245 61666 
.00000 00100 59508 83756 36067 
.00000 00054 93334 86740 22140 
.00000 00031 03648 84426 10042
.00000 00018 07512 14271 91775 
.00000 00010 81707 19966 49114 
.00000 00006 63448 61871 47304
.00000 00004 16088 73981 22512 
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00000 
QOO000 
00000 
OOO00 
QOO000 
Q0000 
QO0000 
00000 
GOO000 
00000 
. 00000 
.00000 
00000 
00000 
00000 
QO000 
00000 
00000 
00600 
00000 
QO000 
00000 
QO0000 
00000 
QUOO000 
QO000 
QO000 
00000 
OQ0000 
QO000 
00000 
00000 
.00000 
00000 
00000 
00000 
00000 
00000 
00000 
.00000 
00000 
.00000 
.00000 
.00000 
.00000 


00173 
OQOOS3 
00041 

00021 

00011 

00006 
00003 
00002 
00001 

00038 
OOO18 
00009 
00004 
00002 
00001 

00000 
00000 
QOOOS8 
00004 
00002 
00001 

00000 
00000 
00000 
00001 

00000 
00000 
00000 
00000 
00000 
Q0000 
00000 
00000 
00000 
QO0000 
00000 
00000 
60000 
00000 
00000 
00000 
00000 
00000 
00000 
00000 


27384 
15615 
78423 
86071 


rane 
85372 


63617 
82338 
26072 
36863 
16117 
37120 
22465 
80842 
59164 
$3952 
$2164 
18074 
48820 
09726 
05618 
06844 
57288 
31603 
17891 

90409 
92124 
16211 

23947 
12782 
07009 
13027 
20860 
10459 
05407 
02874 
09785 
04752 
02382 
01229 
02237 
01088 
00545 
00514 
00250 
00118 


86156 
35388 
30143 
81467 
23133 
77190 
97530 
77303 
22260 
27130 
27936 
26120 
22868 
95414 

02115 
91287 
38746 
90858 
88769 
54654 
58427 
72363 
35816 
30211 

96774 

85117 
04276 
60524 

39493 
09859 
27493 
01282 
71110 
72128 
98777 
31923 
54436 
29028 
11411 

95808 
67316 
57053 
39826 
59182 
76445 


31589 
09490 
32371 

O8134 
22632 
91648 
22544 

26465 
79079 
13665 
28823 
70723 
05723 
97468 
97277 
93300 
06020 
46154 
$4710 
90472 
48192 
36801 

41955 
80750 
61853 
15084 
26404 
37080 
37525 
73929 
21081 

43064 
66198 
71865 
38563 
99285 
40160 
10025 
40161 

85931 

53745 
31324 
97744 
84180 
88778 



ESTIMATION OF LOCATION AND SCALE PARAMETERS BY 
ORDER STATISTICS FROM SINGLY AND DOUBLY 
CENSORED SAMPLES 


Part I. The Normal Distribution up to Samples of Size 10 


By A. E. SARHAN AND B. G. GREENBERG 
University of North Carolina 


1. Introduction. Type II censored samples [3] are considered, in which the
total number of sample elements is known but the observations for some of
the extreme elements are missing. Singly censored samples are those in which
only the smallest r1 observations or the largest r2 observations are missing,
whereas samples having both the r1 smallest and the r2 largest observations
missing are called doubly censored samples. This general case of estimation
includes, as special cases, the estimates obtained from singly censored samples
(r1 or r2 = 0) as well as those obtained by using all the sample elements
(r1 = r2 = 0).
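
As a minimal sketch of the censoring scheme just described, the following Python fragment orders a sample and drops the r1 smallest and r2 largest values. The function name `censor_type_ii` and the data are hypothetical and serve only as illustration; they do not come from the paper.

```python
def censor_type_ii(sample, r1, r2):
    """Return the ordered observations left after dropping the r1 smallest
    and the r2 largest values of the sample (Type II censoring)."""
    n = len(sample)
    if r1 < 0 or r2 < 0 or r1 + r2 >= n:
        raise ValueError("need r1, r2 >= 0 and r1 + r2 < n")
    ordered = sorted(sample)
    return ordered[r1:n - r2]


# Hypothetical data, n = 6 (illustration only).
sample = [4.1, 2.7, 5.3, 3.9, 6.0, 1.8]

single = censor_type_ii(sample, r1=0, r2=2)    # singly censored (largest two missing)
double = censor_type_ii(sample, r1=1, r2=2)    # doubly censored
complete = censor_type_ii(sample, r1=0, r2=0)  # no censoring: all elements kept
```

Setting r1 = r2 = 0 returns the complete ordered sample, which is the sense in which the doubly censored case contains the others as special cases.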

The approach to the general case in censoring is of value not only for its nu- 
merical results. It enables the drawing of inferences concerning interesting and 
important patterns for the coefficients, variances, and the relative efficiencies 
of the estimates. These features could not be and were not revealed in the earlier, 
less general studies. These conclusions will be considered in Section (5). 

In Part I, estimation of the mean and standard deviation from singly and
doubly censored samples drawn from the normal distribution will be considered
for samples of size n ≤ 10. A generalization of an alternative estimate
previously given by Gupta is also obtained. In future work, it is planned to
extend the tables up to samples of size n ≤ 20 and to include the two- and
one-parameter single-exponential distributions.

Estimates of the parameters using the best linear systematic statistics are 
obtained by arranging the known sample elements in ascending order (i.e.,
y(1) ≤ y(2) ≤ ... ≤ y(n)) and applying the method of least squares to get the best
linear combination of them. The coefficients provided for these linear estimates 
of the ordered observations make them unbiased with minimum variance. The 
method used in calculation is identical with that given by Gupta in [8] with 
slight modifications. 
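
The least-squares step can be sketched numerically. The fragment below assumes the usual generalized least-squares form, in which the coefficient rows are (A'V⁻¹A)⁻¹A'V⁻¹ with A = [1, α], α the expected values of the standardized order statistics, and V their covariance matrix; the text does not spell this formula out, so it is an assumption here. The covariance entries are read from the n = 3 rows of Table I, and the mean 3/(2√π) of the largest of three standard normal observations is a standard result. The helper names (`invert`, `matmul`, `blue_coefficients`) are hypothetical.

```python
from math import pi, sqrt


def invert(m):
    """Gauss-Jordan inverse of a small square matrix (list of lists)."""
    n = len(m)
    aug = [list(map(float, row)) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        # Partial pivoting: bring the largest remaining entry to the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def blue_coefficients(alpha, V, keep=None):
    """Coefficient rows (for the mean, for the standard deviation) applied to
    the retained order statistics; `keep` lists their 0-based positions."""
    idx = list(range(len(alpha))) if keep is None else list(keep)
    A = [[1.0, alpha[i]] for i in idx]
    Vk = [[V[i][j] for j in idx] for i in idx]
    At = [list(r) for r in zip(*A)]
    AtVinv = matmul(At, invert(Vk))
    return matmul(invert(matmul(AtVinv, A)), AtVinv)


# n = 3: expected values and covariances of standard normal order statistics.
a3 = 3 / (2 * sqrt(pi))
alpha = [-a3, 0.0, a3]
V = [[.5594672038, .2756644477, .1648683485],
     [.2756644477, .4486711046, .2756644477],
     [.1648683485, .2756644477, .5594672038]]

mu_full, sigma_full = blue_coefficients(alpha, V)               # r1 = r2 = 0
mu_cens, sigma_cens = blue_coefficients(alpha, V, keep=[0, 1])  # r1 = 0, r2 = 1
```

With the complete sample, the coefficients for the mean come out as 1/3 each, i.e. the ordinary sample mean. With the largest observation censored, only two order statistics remain and the two unbiasedness conditions determine the coefficients completely.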


2. The normal distribution. To advance the study of order statistics for the 
normal distribution, Hastings et al. [4] calculated the means, variances, and co- 
variances of the order statistics up to samples of size 10. Godwin [1] calculated 
these quantities more accurately as well as extending them to more decimal 
places. From his tables, he was able to calculate the best linear systematic statis- 
tic of the standard deviation [2] using all the sample elements for samples of 
TABLE I
Variances and covariances of order statistics in samples of sizes up to 20 from a 
normal population 


n | i | j | Value || n | i | j | Value || n | i | j | Value

.6816901139 

.3183098861 3 ‘ . 2197215626 . 1705588454 
.6816901139 . 1655598429 | 5 | .1369913669 
.5594672038 ‘ . 1296048425 ) .1126671842 
. 2756644477 4 . 2104468615 | ¢ { . 1661012814 
. 1648683485 ‘ .3728971434 . 3443438233 
.4486711046 : . 1863073997 2 | .1712629030 
.4917152369 ‘ . 1259660300 < . 1162590989 
. 2455926930 .0947230277 : .0882494247 
. 1580080701 é .0747650242 ‘ .0707413677 
. 1046840000 ) .0602075169 ) .0583987134 
. 3604553434 .0482985508 .0489206279 
. 2359438935 : .0368353073 é .0410844589 
.4475340691 ‘ ‘ . 2394010458 : .0340406470 
. 2243309596 3 . 1631958727 .0266989351 
. 1481477252 . 1232633317 y 4 2145241430 
. 1057719776 : .0975647193 3 . 1466226180 
.0742152685 .0787224662 .1117015961 
.3115189521 d 0632466118 € 0897428245 
. 2084354440 d 3 . 2007687900 ) .0741995414 
. 1499426668 { . 1523584312 .0622278486 
. 2868336616 { 1209637555 0523067222 
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.4159271090 .0978171355 ¢ 0433711561 
. 2085030023 . 1871862195 < q . 1750032834 
. 1394352565 , . 1491754908 1338022448 
. 1024293940 ] . 3973080204 t 1077445336 
.0773637839 ’ .1781434240 j .0892254012 
.0563414544 3 . 1207454442 .0749183943 
. 2795777392 0913071400 0630332449 
. 1889859560 .0727422354 . 1579389144 
. 1396640604 j 0594831125 f . 1275089295 
.1059054582 .0490764061 } . 1057858169 
. 2462125354 : .0400936927 .0889462026 
. 1832727978 { .0310552188 § . 1510539039 
.3919177761 ; : 2256968778 ) . 1255989678 
. 1961990246 3 1541163526 .3332474428 
.1321155811 { .1170056918 ; . 1653647712 
.0984868607 f .0934477394 : .1123584351 
.0765598346 j .0765461431 d .0855170596 
.0599187124 .0632354695 f .0688483064 
.0448022105 { .0517146091 ) .0572007586 
. 2567328862 3 : . 1863826133 .0483754063 
.174483327 : .1420779776 .0412423472 
1307298656 § .1137680176 .0351103357 

1019550089 j 0933625386 .0294198503 

0799811748 7 .0772351806 11 .0233152868 
.2051975798
. 1403096511 


1071492595 || 


.0864430257 
.0719305024 
0608869662 
0519504506 
.0442549455 
.0371029977 
. 1657242880 
1269672925 
1026407291 
0855178832 
.0724741050 
.0618873278 
.0527550069 
1479546565 
1198752861 
. 1000346585 
.0848765182 
0725451434 
. 13896410804 
. 1167449805 
0991935960 


. 1371624335 | 


. 3236363870 
. 1602373762 
. 1089309641 
.0830686767 
.0670884464 
.0559933694 
.0476620974 
.0410208554 
.0354439060 
.0305012591 
.0257945392 
.0206221233 
. 1972646039 
. 1349020328 
- 1031959206 
.0835045822 
.0697859658 
.0594590652 
.0512113198 
.0442747124 
.0381191478 


.0322507340 || 


. 1579786877 














TABLE I—Continued 
n | i | j | Value || n | i | j | Value
2 | 3} | 3 i 3s] | 
| | 4 | -1212063211 | | 8 | .0589221432 
| | 5 | .0982605602 || | 9 | .0514460445 
| 6 | 0822228461 110 | .0449637542 
| 7 | .0701213964 | | | 11 | .0390643799 
| 8 | .0604384621 | 4| 4 | .1330111820 
| 9 | .0522825611 || 5 | .1082512667 
10 | .0450357615 || 6 | .0909855605 
4} 4 | 1398109405 | 7 | .0780173339 
5 | .1135687821 || 8 | .0677217143 
6 | .0951645279 || 9 | .0591628729 
7 | .0812419810 || 10 | .0517328050 
8 | .0700795832 || 5 | 5 | .1232503256 
9 | .0606620874 6 | .1037367701 
5 | 5 | .1306137359 7 | .0890434754 
6 | .1096212247 8 | .0773552864 
7 | .0936951520 9 | .0676230994 
8 | .0808972960 6| 6 | .1183175325 
| 6| 6 | .1266377911 7 | .1016824204 
7 | .1083945831 || 8 | .0884194610 
13 1} 1 | .3152053842 7| 7 | .116798995 
2 | .1557272904 || 14 1} 1 | .3077301026 
3 | .1058908842 || 2 | .1517203662 
4 | .0808649736 || 3 | .1031719531 
5 | .0654634499 4 | .0788715916 
6 0548221797 || 5 .0639657428 
7 | 0468833088 6 | .0537064714 
8 | .0406132548 7 | .0460899189 
9 | .0354226462 8 | .0401141688 
| | 10 | .0309322744 9 | .0352141760 
| | 11 | .0268537250 | 10 | .0310371163 
12 | .0228858068 | 11 | .0273362865 
| | 13 | .0184348220 | 12 | .0239061001 
| 2) 2 | .1904130721 | 13 | .0205080257 
3 | . 1302055829 | 14 | .0166279801 
| | 4 | .0997262696 2\ 2 | .1844200252 
| | 5 | .0808785938 3 | .1260791989 
| 6 | .0678145832 || 4 | .0966524633 
7 | .0580457285 | 5 | .0785202981 
8 | .0503167946 || 6 | .0660028340 
9 | .0439095087 || 7 .0566896715 
| 10 | .0383601798 || 8 | .0493708148 
| | 11 | .0333147765 | 9 | .0433617156 
| | 12 | .0284018130 || | 10 | .0382337404 
3| 3 | .1513917013 | 11 | .0336863221 
| | 4 | .1162698131 |} | | 12 | .0294681314 
| | 5 | .0944566603 || 13 | .0252863928 
| | 6 | .0792922993 || 3 | 3 | .1457045665 
Fo 4 .0679282354 || 4 | .1119816877 
.0911181271
.0766754957 
0659084825 
.0574341188 
.0504677802 
.0445169192 
.0392352316 
0343322071 | 
.1272273070 | 
.1036931108 
.0873562483 || 
.0751519909 
.0655310936 || 

| .0576120957 || 
.0508402240 || 
.0448243469 
.1171012461 

| .0987747550 

| 0850536546 || 

| .0742181416 || 
.065286777 
.0576401464 

| .1115324579 || 

.0961405595 || 

.0839617110 || 

.0739069221 | 

.1090269480 || 

0953087256 | 

.3010415703 

.1481297708 

.1007223449 

.0770594060 

.0625845851 

0526530129 

.0453078886 

.0395736673 

.0349035905 



275211039 


.0214819828 
.0185333263 
.0151137071 
.1791215291 
. 1224176953 
.0939067144 
.0763912337 
.0643390895
.0554074400 || 
.0484238833 | 


| 0427294113 


.0379177516 
.0337151721 


| .0299152347 


.0263303885 
.0227213594 
. 1407322502 


| 1082138452 


.0881605755 
.0743268436 
-0640558183 


| .0560136122 


.0494485109 
.0438960670 


| .0390426915 


.0346513382 
.0305060359 
. 1222328270 
.0997323941 
-0841705696 
.0725946869 
.0635175907 
-0560990511 


| .0498187836 


-0443247452 


| 0393501820 
| . 1118698986 


.0945206004 
.0815891122 
.0714331681 


.0631224388 


.0560795065 
.0499127743 
. 1058666366 


| 0914683204 
| .0801407559 
| .0708582099 


| 


.0629824402 
. 1026916923 
.0900499964 
.0796738323 
.1016946521 
. 2950098090 


1448881689 


.0985009764 
.0754040023 
| .0613086724 
.0516624963 


0445503705 
0390194716 


.0345378158 
.0307810093 


275353612 


.0246479007 
.0219956755 
.0194585037 
.0168710289 
.0138287378 
.1743940788 


1191409287 


.0914359918 
.0744591145 
.0628093909 
.0542033941 
-0475009769 
.0420638270 
.0375018250 
.0335574912 
.0300461298 
.0268189579 


0237301562 


.0205785433 


1363385612 
1048706756 
0855189036 


| .0722075087 
| .0623568515 
.0546749107 
| .0484366096 
.0431979377 
-0386652005 
.0346277256 
.0309149135 
.0273595378 
. 1178657554 
0962513413 
.0813480448 
0703000911 
.0616728990 


0546595026 
.0487647746 


TABLE I—Continued 


n | i | j | Value || n | i | j | Value || n | i | j | Value
16 4 17 2 | 17 6 
ll .0436607328 i 9 .0413928192 || 12 .0478122599 
12 | .0391112669 | |10 | .0870349110 | | 7 | 7 | .0929031780 
13 | .0349253749 11 | .0332940892 || 8 | .0818194607 
5 | 5 | .1073517089 12 | .0299982825 || 9 | .0728154074 
6 | .0908232622 3 | .0270170379 10 | .0652667274 
7 | .0785480532 || 14 | .0242386812 || 11 | .0587626219
8 | .0689488802 || 15 | .0215459396 || 8 | 8 | .0907361650
9 | .0611364182 || 16 | .0187658306 || 9 | .0808000267
10 | .0545638941 || 3 | 3 | .1324207975 || 10 | .0724599963
11 | .0488684327 || 4 | .1018792434 || 9 | 9 | .0900465814
12 | .0437882959 || 5 | .0831421716 || 18 | 1 | 1 | .2845301297
6 | 6 | .1010461906 || 6 | .0702850403 || 2 | .1392501620
7 | .0874627156 || 7 | .0607964413 || 3 | .0946172637
8 0768239668 8 .0534208202 || 4 .0724851730 
9 0681545540 9 | .0474555487 | | 5 | .0590304274 
10 0608534805 10 .0424726884 | 6 .0498600635 
11 0545210724 11 | .0381925587 | 7 | .0431302310 
wi 7 0974026613 12 | .0344194567 | 8 | .0379260195 
8 | .0856181916 13 | .0310047771 || 9 | .0337388141 
9 0760015577 14 0278210708 10 | .0302610667 
10 0678931922 15 0247342095 11 | .0272938041 
8| 8 0957213007 4) 4 1140068197 || 12 | .0247002471 
9 .0850291218 5 0931620339 13 .0223801573 
an oe 2895330037 ti 0788266621 | 14 | .0202537421 
2 | .1419424629 7 | .0682298909 15 | .0182488619 
| 3 | 0964748737 8  .0599826092 16 | .0162850441 
4 | .0738849615 9 | .0533057575 17. | .0142368875 
5 | .0601272302 10 | .0477239973 || 18 | .0117719054 
6 0507326948 11 | .0429261816 2| 2 | .1662929294 
7 | .0438236491 12 | .0386942630 | 3 | .1135058132 
8 0384672834 13 .0348624030 || 4 0871597604 
9 0341441055 14 | .0312881041 5 | .0710825990 
10 | .0305389548 | 5] 5 | .1034004377 || 6 | .0600975754 
11 | .0274465527 | | 6 | .0875729930 || 7 | .0520217423 
12 | .0247237144 7 | .0758534534 8 | .0457683625 
13 | .0222620771 8 | .0667204245 9 | .0407317967 
14 | .0199690651 9 | .0593187706 || 10 | .0365451034 
15 | .0177476891 10 | .0531257771 || 11 | .0329704894 
16 .0154552071 11 .0477987292 12 .0298442464 
17 | .0127264751 12 | .0430970793 || 13 | .0270462261 
| 2] 2 | .1701426762 | | 13 | .0388375657 || 14 | .0244806359 
3 | .1161866734 | 6| 6 | .0968824669 | | 15 | .0220607111 
4 | .0891982557 | 7 | .0839811738 || 16 | .0196894667 
5 | .0726970385 | 8 | .0739130260 | 17 | .0172154925 
| .0613998459 | 9 | .0657442736 || 3 | 3 | .1288998943 
| 7 | .0530761573 || | 10 | .0589030403 || 4 | .0991828539 
| g | 0466140918 | 11 | .0530137275 } | 5 | .0809899792 
TABLE I—Continued


| - 2 | Value 
panei enete —| as - 
| | 18 | 1 
.0685324700 | 8 | 8 | .0864960639 || 8 | .0511541418 
| 0593598602 || 9 | .0771762286 || | 0456228816 
| .0522488413 .0693891332 | .0410365629 
| 0465162123 .0627116906 | .0371346427 
.0417473296 .0853127880 2 | .0337391171 
.0376730987 || 4 | .0767442321 3 | .0307215918 
| .0341080171 | .2799358050 .0279835020 
| .0309157650 | .1367768168 .0254424108 
| .0279875014 | | .0929061763 .0230195063 
.0252244786 | 0711902425 .0206214645 
.0225161109 | ,0580094835 .1074740839 
.1105660331 || | .0490405678 5 | .0879051965 
| .0903973787 || | .0424705246 3 | .0745033878 
.0765579277 | .0374006329 | 0646406188 
| 0663522086 | | 9 | .0333319395 8 | .0570032284 
| .0584310521 || | .0299634144 9 | .0508572608 
| .0520394281 || = .0271011338 | .0457576598 
| .0467183404 || .0246129452 .0414165091 
| .0421694861 3 | .0224037540 2 | .0376368753 
| .0381869632 |} .0204007370 { .0342765540 
| 0346192645 || 5 | .0185431530 .0312262549 
.0813452497 .0167731147 .0283944527 
.0282548286 | .0150223067 .0256935148 
.0999084321 || .0131789994 5 | 5 | .0967944745 
.0846879168 || 9 | .0109382527 5 §| 0821055695 
.0734460811 || 2| 2 | .1627856651 .0712796742 
| .0647101858 ‘ .1110590145 0628870095 
.0576543520 4 | .0852931053 .0561272025 
.0517756675 5 | .0695970759 .0505141639 
.0467468133 3} | .0588910196 .0457330144 
.0423415563 .0510351093 .0415681234 
| .0383932046 8 | .0449652247 3 | .0378636088 
.0347682770 9 | .0400891754 .0344995261 
.0932407331 | 0360490040 | 15 | .0313752928 
| .0809202644 .0326137544 | 6 .0900218693 
| 0713338046 2 | .0296258236 .0782029063 
| .0635829688 3 | .0269716592 .0690294360 
.0571197288 | .0245641909 | .0616336896 
| 0515868552 | 0223306885 .0554877905 
.0467370896 | .0202017247 .0502493169 
.0423879846 | 0180952193 | 0456834841 
0890167025 | 0158767294 .0416203596 
| .0785179677 ros 4 .1257138904 .0379290224 
.0700199026 .0967367097 .0856172981 
.0629269074 5 | .0790298792 | 8 | .0756153413 
.0568501034 | 6 | .0669273696 .0675433161 
| .0515199092 .0580336124 .0608297030 
TABLE I—Concluded 
n i j Value n i j Value n i j Value
19 7 20 2 || 20 5 
1] 0551032224 | 1] 0322405467 || 7 | .0693175756 
12 0501089625 12 0293684960 || 8 | .0612251429 
3 0456621835 13 0268315105 || 9 | .0547222526 
8| 8 0828339961 14 0245479493 | 10 | .0493374275 
9 0740273546 15 0224526609 || 11 | .0447662310 
10 0666958229 16 0204888032 |) 12 | .0408014074 
11 0604372723 17 | .0185994024 13. | .0372948400 
12 0549752083 18 0167136502 14 | .0341351571 
gy 9 0812876330 19 0147107671 15 .0312332040 
10 0732703911 3 3 1228134687 16 .0285109200 
1] .0664202898 4 0945049010 6 6 .0871511254 
10 | 10 | .0807909751 5 0772355098 7 | .0757703360 
20 1| 1 | .2756966156 6 0654510179 8 | .0669555789 
2 1344941714 7 0568056677 || 9 | .0598659769 
3 | .0913234064 8 0501310269 | 10 | .0539910639 
4 | .0699879991 9 0447763202 11 | .0490008080 
5 | .0570566384 10 0403482354 12 | .0446702771 
6 .0482701093 11 0365934287 || 13 .0408385549 
7 | .0418437826 12 0333397949 || 14 | .0373845194 
8 | .0368937058 13 0304645792 | 15 | .0342111024 
9 | .0329296302 14 | .0278756579 7| 7 | .0826123955 
10 | .0296562523 15 | .0254994381 8 | .0730383676 
11 | .0268838808 | 16 0232716371 | 9 | .0653307665 
12 | .0244839567 || 17 0211277373 || 10 | .0589387428 
13. | .0223649803 18 | .0189874448 || 11 | .0535056766 
14 | .0204584277 4| 4 | .1046766243 || 12 | .0487882257 
15 | .0187096782 5 0856442356 | 13 | .0446121090 
16 | .0170711408 6 | .0726321560 ] 14 | .0408459989 
17 | .0154951854 | 7 | .0630731775 || 8 | 8 | .0796309757 
| 18 | .0139227072 | 8 | .0556855081 || 9 | .0712591607 
19 | .0122530117 || | 9 0497539273 || 10 | .0643103375 
2 | .0102047204 110 | .0448455403 || 11 | .0583997310 
2| 2 | .1595731636 11 | .0406811669 | 12 | .0532644495 
3 | .1088143707 12 | .0370709493 | 13. | .0487159834 
4 | .0835758044 3 | 0338793392 || 9| 9 | .0778118317 
5 .0682247554 14 .0310045146 |} 10 | .0702526464 
6 | .0577699656 15 | .0283650517 || 11 | .0638176734 
7 .0501109523 16 .0258897454 | 12 | .0582229133 
8 | .0442041191 17 | .0235070343 || 10 | 10 | .0769474356 
9 | .0394693443 5 | 5 | .0939960007 || 11 | .0699266198 
1} | 


10 .0355565554 6 .0797773755 
sizes $\leq 10$. Owing to the fact that the sum of the rows of the variance matrix
of the order statistics (and by symmetry the sum of the columns) is equal for
any sample size, the best linear systematic statistic of the population mean is, of
course, the sample mean [5].

Later, Gupta [3] used Godwin's tables of means, variances, and covariances
to find the best linear systematic statistic of the population mean and standard
deviation from singly censored samples. Although Gupta considered Godwin's
results as being a special case of his own, the more general case of estimation
from doubly censored samples considered here includes the other two cases
($r_1$ or $r_2 = 0$ and $r_1 = r_2 = 0$).

Recently, Rosser [6], at the National Bureau of Standards' Numerical Research
Analysis in Los Angeles, tabulated up to nineteen decimal places the
expected values of the standardized $i$th order statistics $X_i$, where $y_i = \mu + \sigma X_i$.
Following this, Teichroew [7], under the same sponsorship, calculated the
expected values of the product of the $i$th and the $j$th order statistics in samples of
sizes $\leq 20$, drawn from a normal population. These valuable accomplishments
make possible an advance in the study of systematic statistics.

Using these tables, the variances and covariances of the order statistics in
samples from a normal distribution were calculated to ten decimal places.
The variances and the covariances for order statistics of sample sizes up to 20,
to ten decimal places, are given in Table I. The missing entries may be obtained
by

$$\operatorname{Cov}[X_{(i)} X_{(j)}] = \operatorname{Cov}[X_{(n-i+1)} X_{(n-j+1)}].$$

In comparing the variances and covariances for $n \leq 10$ with those tabulated (to
five decimals) by Godwin [1], it can be seen that several of the latter values are 
in error by more than one unit in the fifth decimal place, and one value Cov 
[X aX ao] when n = 10 is in error by as much as 5 units. 

Table II gives the coefficients $a_{1i}$ and $a_{2i}$ for the best linear systematic
statistics of the mean and standard deviation, respectively, from singly and doubly
censored samples of sizes $\leq 10$ drawn from a normal population. For the case
when $r_1 = r_2 = 0$, these tabulated results are more accurate than Godwin's
[2]. For the singly censored sample, i.e., $r_1$ or $r_2 = 0$, the tabulated results are
more accurate than those of Gupta because the latter based his calculations
on the earlier tabulations of Godwin. The tabulated values in Table II are
correct except for the last place, which may be in error by one or two units.

In comparing these tabulated values with those of the cited references, the 
results of Gupta [3] can be compared directly, whereas Godwin’s [2] values are 
obtained in terms of rank differences. Consequently, Godwin’s first coefficient 
should be exactly equivalent, except for sign, to that herein, and the remainder 
can be obtained by subtraction. 

If the coefficients of an estimate are sought for a value of $r_1$ not given in the
table, these can be obtained by interchanging the values of $r_1$ and $r_2$ and
rearranging the observations in descending order. In such an event, the coefficients
for the best linear systematic statistic of the mean will be identical with those
given in the table, whereas those for the standard deviation will be numerically
the same but with opposite sign.

The variances of the estimates and their covariances are given in Table III
in terms of $\sigma^2$. These values (for $r_1 = 0$) may also be compared with those of
Gupta. Their percentage efficiencies relative to the best linear systematic
statistic based on the complete sample are tabulated in Table IV.


3. Alternative estimates. Gupta suggested alternative linear estimates for the 
mean and standard deviation of a normal population to be used for samples of 
size >10 because of the tediousness of calculating the exact estimate and be- 
cause the variances and covariances of the order statistics for larger samples 
were unavailable. He calculated the variances and relative efficiencies of these 
alternative estimates for a sample of size 10. Gupta’s alternative linear estimates 
were intended for the special case of singly censored samples. In the present 
work, this has been extended to permit its use in the general case of doubly cen- 
sored samples. 

This alternative estimate is based on the assumption that the variance matrix
of the order statistics is a unit matrix. Therefore, the alternative estimates for
doubly censored samples with $r_1$ and $r_2$ missing observations will be

$$\mu^{*\prime} = \sum_{i=r_1+1}^{n-r_2} b_i y_{(i)}, \tag{3.1}$$

where $\mu^{*\prime}$ is the alternative estimate of the population mean, and

$$\sigma^{*\prime} = \sum_{i=r_1+1}^{n-r_2} c_i y_{(i)}, \tag{3.2}$$

where $\sigma^{*\prime}$ is the alternative estimate of the population standard deviation.

The values of $b_i$ and $c_i$ in $\mu^{*\prime}$ and $\sigma^{*\prime}$ are determined by

$$b_i = \frac{1}{n - r_1 - r_2} - \frac{\bar{u}_k (u_i - \bar{u}_k)}{\sum_{j=r_1+1}^{n-r_2} (u_j - \bar{u}_k)^2} \tag{3.3}$$

and

$$c_i = \frac{u_i - \bar{u}_k}{\sum_{j=r_1+1}^{n-r_2} (u_j - \bar{u}_k)^2}, \tag{3.4}$$

where

$$\bar{u}_k = \frac{1}{n - r_1 - r_2} \sum_{j=r_1+1}^{n-r_2} u_j,$$

or the arithmetic mean of the expected values of the uncensored sample elements.
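
The coefficients (3.3) and (3.4) can be evaluated directly once the expected values $u_i$ are available. The following is a minimal Python sketch, not the authors' computation: the function name `alternative_coefficients` is hypothetical, and the five-decimal expected values of the normal order statistics for $n = 10$ are assumed from standard tables rather than taken from this paper.

```python
def alternative_coefficients(u):
    """Return (b, c) of (3.3)-(3.4) for the uncensored order statistics.

    u: expected values u_i of the standardized order statistics that
       remain after censoring (i = r1+1, ..., n-r2).
    """
    k = len(u)                      # k = n - r1 - r2 known observations
    u_bar = sum(u) / k              # mean of the retained expected values
    ss = sum((uj - u_bar) ** 2 for uj in u)
    c = [(ui - u_bar) / ss for ui in u]        # (3.4)
    b = [1.0 / k - u_bar * ci for ci in c]     # (3.3)
    return b, c


# Assumed input: expected values of standard normal order statistics, n = 10,
# rounded to five decimals (from standard tables, not from this paper).
U10 = [-1.53875, -1.00136, -0.65606, -0.37576, -0.12267,
       0.12267, 0.37576, 0.65606, 1.00136, 1.53875]

n, r1, r2 = 10, 2, 3
u = U10[r1:n - r2]                  # u_3, ..., u_7
b, c = alternative_coefficients(u)
```

With these inputs the weights satisfy the unbiasedness conditions $\sum b_i = 1$, $\sum b_i u_i = 0$, $\sum c_i = 0$, $\sum c_i u_i = 1$, and they agree to about four decimals with the alternative coefficients listed in the numerical example of Section 4.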
The estimates (3.1) and (3.2) are unbiased estimates of the mean and stand- 


TABLE IV


Percentage efficiencies of estimates of the mean (u*) and standard deviation (c*) for censored 
samples relative to uncensored samples in a normal population up to size 10 





| 100.00 95. 86 8. 3.98 17.29 
100.00 77 58 .6 41. oD. 12.08 


100.00 9. ‘ 58. 35. 13.43 
100.00 ‘ ode 48. 34.4% . 6 10.16 
TABLE IV—Concluded


2 
n ri = 

0 1 2 3 4 5 6 7 8 
9 ( 2 68 .20 49.59 28 .83 10.77 


) | 100.00 97.11 91.52 82.2 
53 


1 
| 100.00 | 82.90! 67.69 | 84| 41.12] 29.40] 18.61 8.76 


1 95.18 90.74 $2.19 67.48 15.63 20.18 
66 .62 52.34 39.53 27 .94 17.48 8.13 
2 88.14 81.62 67 .07 38 .97 


39.11 27 .42 17.05 7.90 


16.93 7.82 | 


10 


) 100.00 97 .56 93 .03 85.72 74.85 60.09 42.27 23 .96 8.87 
100.00 84.62 70.85 58 .23 55 35.70 25 .62 16.27 | 7 


l 95.85 92.20 85.60 74.69 58.40 | 37.42 15.86 
69.88 56.83 45.00 .29 15.27 7.13 | 


w 
vee 
_ 
bt 

~ 


to 
% 


87 | 84.78 | 74.69 | 56.61 | 29.40 
44.58 | 33.62 | 23.74] 14.85) 6.93 


3 | 82.09 | 74.48 | 53.60 
23.59 | 14.69| 6.84 

| | 

4 | | 72.29 | 


| 6.81 


ard deviation, respectively. In order to compare these estimates with the best 
linear systematic statistics of the mean and standard deviation, the coefficients, 
variances, and relative efficiencies of the alternative estimates were calculated 
for all samples of size up to and including 10 and for all values of $r_1$ and $r_2$.
The variances and relative efficiencies of the alternative estimate are provided
only for samples of size 10 in Table V. The variance matrix of the estimates is
obtained from $(A'A)^{-1}A'VA(A'A)^{-1}\sigma^2$, where $V$ is the variance matrix of the
order statistics and $A$ is the coefficient matrix when their means are expressed
in terms of the parameters. The percentage efficiencies are relative to the
corresponding best linear systematic statistics.
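
The variance-matrix formula can be illustrated in a few lines. This is a sketch under stated assumptions: the function name `alt_estimate_covariance` is hypothetical, $A$ is taken to be the two-column matrix with rows $(1, u_i)$, and the check uses the complete sample of size $n = 2$, whose order-statistic moments ($u = \mp 1/\sqrt{\pi}$, variances $1 - 1/\pi$, covariance $1/\pi$) are standard results rather than entries from Table I.

```python
import math


def alt_estimate_covariance(u, V):
    """Covariance matrix, in units of sigma^2, of the alternative estimates.

    Evaluates (A'A)^{-1} A' V A (A'A)^{-1} for A with rows (1, u_i), where
    V is the variance matrix of the retained standardized order statistics.
    """
    k = len(u)
    s0, s1, s2 = float(k), sum(u), sum(x * x for x in u)
    det = s0 * s2 - s1 * s1
    # Rows of (A'A)^{-1} A': the weights of the mean and of the
    # standard deviation estimate, respectively.
    b = [(s2 - s1 * x) / det for x in u]
    c = [(s0 * x - s1) / det for x in u]

    def quad(p, q):
        return sum(p[i] * V[i][j] * q[j] for i in range(k) for j in range(k))

    return [[quad(b, b), quad(b, c)], [quad(c, b), quad(c, c)]]


# Assumed check case: complete normal sample of size 2.
u2 = [-1.0 / math.sqrt(math.pi), 1.0 / math.sqrt(math.pi)]
V2 = [[1.0 - 1.0 / math.pi, 1.0 / math.pi],
      [1.0 / math.pi, 1.0 - 1.0 / math.pi]]
cov2 = alt_estimate_covariance(u2, V2)
```

For this case the result is a variance of $1/2$ for the mean (the usual $\sigma^2/n$), a variance of $\pi/2 - 1$ for the standard deviation estimate, and zero covariance.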

Examination of Table V shows that for $n = 10$, and for all the values of $r_1$
and $r_2$, the alternative estimates are highly efficient and can replace the best
linear systematic statistics without great loss of precision. In fact, use of these
estimates for samples greater than 10 may be preferable to awaiting tabulation
of coefficients for the most efficient estimates which involve the inversion of
matrices of large order. The most efficient linear systematic statistics for the
mean and standard deviation of a normal population from censored samples of
sizes $\leq 20$ will be given in a sequel to this paper (Part II). The authors feel,
however, that for samples of sizes $> 20$, the alternative estimate should be used
because of its high relative efficiency to the exact estimate.


4. Numerical example. Ten medical students were learning to measure sys- 
tolic blood pressure and practicing by taking readings upon each other. Poor 
technique was present in all observations but the skill of the observers in using 
the measuring device was limited initially to the central portion of the range. 
Owing to the relatively larger measurement error known to exist at the extremes, 
observations believed to be less than 105 mm. and greater than 125 mm. were 
censored. This practice resulted in censoring two observations on the left and 
three on the right in the sample of ten. 


The data, when arranged in an array, appear as the first column in the following
tabulation:

                                       Coefficients
   Ordered             Exact estimate               Alternative estimate
   observations      $\mu^*$       $\sigma^*$       $\mu^{*\prime}$   $\sigma^{*\prime}$
    1.    -          0             0                0                 0
    2.    -          0             0                0                 0
    3.   108         .20496319    -.88982266        .09515275        -.79906860
    4.   111         .10382532    -.11005067        .15114637        -.37232645
    5.   119         .11220127    -.02620385        .20170682         .01300816
    6.   121         .11982080     .05494874        .25071680         .38652614
    7.   125         .45918942     .97112842        .30127725         .77186075
    8.    -          0             0                0                 0
    9.    -          0             0                0                 0
   10.    -          0             0                0                 0
   Estimate of parameter by linear systematic statistics
                     118.9         16.61            119.1             17.17


If one assumes that the sample was drawn from a normal universe, the exact
coefficients ($a_{1i}$ and $a_{2i}$) for the best linear estimate, obtained from Table II for
the case $n = 10$, $r_1 = 2$, $r_2 = 3$, are those shown in the second and third columns.
If the alternative estimate is desired, the coefficients are obtainable from (3.3)
and (3.4) in combination with a table for the values of $u_i$.

The exact and alternative estimates of the mean and standard deviation are 
provided. They are similar, as they should be according to the relative efficiencies 


1 The derivation of coefficients given in Table II for estimating the mean and standard 
deviation was based upon the assumption that censoring would occur by fixed percentages 
of the sample. The use of these same coefficients in the present example where censoring is 
performed according to fixed points on the abscissa raises a question of possible bias. Based 
upon the results in a sampling investigation, it is felt that this possible bias is probably not 
an appreciable one and of little concern in practical considerations. 
given in Table V, viz., 97.95 percent for the mean and 96.88 percent for the
standard deviation. 
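
The four estimates quoted in the tabulation are weighted sums of the five known observations and can be reproduced as follows. This is an illustrative sketch: the helper name `linear_estimate` is hypothetical, and two digits that are illegible in the printed coefficients (the last digit of the second exact mean coefficient and the leading digit of the last alternative mean coefficient) are assumed here from the requirement that each set of mean coefficients sums to one.

```python
def linear_estimate(coeffs, obs):
    """Linear systematic statistic: sum of coefficient * ordered observation."""
    return sum(a * y for a, y in zip(coeffs, obs))


# Known (uncensored) ordered observations y_(3), ..., y_(7), in mm.
y = [108, 111, 119, 121, 125]

# Exact coefficients for n = 10, r1 = 2, r2 = 3 as listed in Section 4.
a1 = [0.20496319, 0.10382532, 0.11220127, 0.11982080, 0.45918942]
a2 = [-0.88982266, -0.11005067, -0.02620385, 0.05494874, 0.97112842]

# Alternative coefficients from (3.3) and (3.4).
b = [0.09515275, 0.15114637, 0.20170682, 0.25071680, 0.30127725]
c = [-0.79906860, -0.37232645, 0.01300816, 0.38652614, 0.77186075]

mu_exact = linear_estimate(a1, y)
sigma_exact = linear_estimate(a2, y)
mu_alt = linear_estimate(b, y)
sigma_alt = linear_estimate(c, y)
```

The four sums agree with the tabulated estimates 118.9, 16.61, 119.1 and 17.17 to within rounding.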


5. Conclusions. Certain characteristic features can be gleaned from the tables 
presented herein. These are as follows: 

(1). Table II (Coefficients for the exact estimates) 

(a) By censoring a sample, the coefficients for the estimate of the mean and 
standard deviation undergoing maximum change are those which are associated 
with the extreme known sample elements. 

(b) For a fixed sample size and for $r_1$ fixed, as $r_2$ increases the coefficient of the
largest known element increases, whereas that of the smallest known element
decreases.

(c) If the sample size is odd, and all the sample elements are censored except 
the middle one and its neighbor on either side, the central observation will have 
all the weight in estimating the mean (i.e., the other observation is of no value). 
If the sample size is even under the same circumstances of censoring, each middle 
observation has one-half of the weight in estimating the mean. 

(2). Table IV (Relative efficiency of exact estimates) 

(a) Comparing the relative efficiency of the estimate of $\sigma$ with the
corresponding efficiency for $\mu$ in censored samples, the efficiency of the estimate of
the former drops more rapidly than that of $\mu$.

(b) Reading the entries for $\sigma^*$ in diagonal fashion reveals that, for fixed $n$ and
fixed uncensored sample size ($r_1 + r_2$ = constant), the efficiency of the "best"
estimate of $\sigma$ is remarkably constant independently of $r_1$ and $r_2$. That is, it does
not matter whether the missing observations are at one end or the other of the
sample or divided in any way between the two ends.

(c) Owing to the previous relationship, an interesting simple table is given 
showing how the efficiency in estimating o varies with the number of known 
values for each sample size. This is useful for practical work in censoring. 


Rough guide for assessing approximate efficiency (percent)* of estimates of o 


| 


Number of uncensored observations in sample, or $k = n - r_1 - r_2$
Sample size, $n$


a . b @ 4 
— — — |__| 


| 


70 100 
50 75 100 
40 60 75 100 
| ¢ 35 45 60 80 100 
9 | } 7 30 40 50 65 80 100 
10 15 25 35 45 55 70 85 


* These values are only 2 or 3 percent off (or less) in almost all cases. 


(d) A simple fact is also evident for the estimate of the mean, $\mu$. Its relative
efficiency holds up (about 70 percent or better) so long as the sample median
value (the two middle values, if size $n$ is even) remains known, no matter how
many values below (or above) this one are missing. Thus, for $n = 10$, even if
only the two values $y_{(5)}$ and $y_{(6)}$ are known, with the other 8 missing ($r_1 = r_2 = 4$),
the efficiency is still 72 percent. But if one of these mid-values, e.g., $y_{(6)}$,
is lost, then all of the other values (to one side) $y_{(1)}, y_{(2)}, y_{(3)}, y_{(4)}, y_{(5)}$ (i.e.,
$r_1 = 0$, $r_2 = 5$) cannot make up for it. They produce an efficiency no better than
60 percent. In other words, a single central value is worth more than half the sample
in estimating the mean.

(e) In estimating the standard deviation, the situation is the direct opposite of
the foregoing one for the mean. In fact, hardly any censoring is tolerable even in the most
favorable case. Thus, for $n = 10$, it can be seen from the above rough table that
for as little as two missing observations ($r_1 + r_2 = 2$), whether at the same or
opposite ends of the sample, the efficiency barely attains 70 percent. For more
missing elements, the efficiency drops rapidly from a value under 60 to as little
as 7 percent.

(3). Table V (Relative efficiency and variances of alternative estimates,
$n = 10$).

The alternative and the "best" estimate of either $\mu$ or $\sigma$ are identical, as
indicated by the 100 percent entry, when $k = n - r_1 - r_2 = 2$, i.e., when only two
of the sample values are known. This is because there are then only two
coefficients to be estimated, which makes the unbiased estimator unique and
therefore the same as the (unbiased) alternative estimator, which is therefore a
fortiori of minimum variance. The other case of identity is for a complete sample
($r_1 = r_2 = 0$), where the best estimate of $\mu$ is the sample mean, and this is also
the alternative estimate. This is because the average, $\bar{u}_k = \frac{1}{n}\sum_{i=1}^{n} u_i$, of all the
means of the sample order statistics is equal to the population mean, which may
be taken as zero without loss of generality; by (3.3) this shows that all the
coefficients $b_i$ must be equal, giving the sample mean.

The authors would like to thank the referee for his many valuable suggestions 
that have considerably improved the original version of the paper. 
A WAITING LINE PROCESS OF MARKOV TYPE¹
By A. B. CLARKE
University of Michigan
Summary. Waiting-line or queuing processes of the Markov type are studied,
the incoming traffic being of Poisson type and having negative-exponential
holding time. The parameters are allowed to depend on time. The problem of
finding an exact solution for the probability distribution of the waiting-line
length as a function of time is reduced to the solution of an integral equation of
the Volterra type. When the ratio of the parameters for the incoming and
outgoing traffic is constant, this equation can be solved explicitly and the required
distribution obtained. Using this solution, the behavior of the process for large
values of $t$ is studied, particularly for the unstable case with traffic intensity $\geq 1$.


Statement of the problem. We shall consider a Markov process $n(t)$ taking
values in the discrete space of nonnegative integers $0, 1, 2, \cdots$, for which there
exist nonnegative continuous functions $\lambda(t)$ and $\mu(t)$ satisfying

(i) for each $n_0 = 0, 1, 2, \cdots$,
$$\Pr\{n(t + \Delta t) - n(t) = 1 \mid n(t) = n_0\} = \lambda(t)\Delta t + o(\Delta t),$$

(ii) for each $n_0 = 1, 2, 3, \cdots$,
$$\Pr\{n(t + \Delta t) - n(t) = -1 \mid n(t) = n_0\} = \mu(t)\Delta t + o(\Delta t),$$

(iii) $\Pr\{|n(t + \Delta t) - n(t)| > 1\} = o(\Delta t)$.

Intuitively, this states that the probability of an "arrival" to the "waiting-line"
during the time interval $(t, t + \Delta t)$ is $\lambda(t)\Delta t + o(\Delta t)$, and the probability
of a "departure" during this interval is $\mu(t)\Delta t + o(\Delta t)$. Thus the system differs
from a process which is simply the difference of two independent Poisson
processes ("arrivals to" and "departures from" the waiting line) only in that $n(t)$
is restricted to nonnegative values.

Letting

$$P_{\nu,n} = P_{\nu,n}(t) = \Pr\{n(t) = n \mid n(0) = \nu\} \qquad (n, \nu = 0, 1, 2, \cdots),$$

the basic "forward" set of Kolmogorov equations for the system becomes

$$\frac{d}{dt} P_{\nu,n} = -(\lambda(t) + \mu(t)) P_{\nu,n} + \lambda(t) P_{\nu,n-1} + \mu(t) P_{\nu,n+1} \qquad (n > 0), \tag{1}$$

$$\frac{d}{dt} P_{\nu,0} = -\lambda(t) P_{\nu,0} + \mu(t) P_{\nu,1} \tag{2}$$

(see, for example, [4], p. 377).
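
Equations (1) and (2) can be integrated numerically for any given $\lambda(t)$ and $\mu(t)$. The sketch below is only an illustration and is not part of the paper's method: it truncates the state space to a finite number of states (an assumption, with arrivals suppressed in the last state so that probability is conserved) and uses a classical fourth-order Runge-Kutta step. The names `forward_rhs` and `solve_forward` are hypothetical.

```python
def forward_rhs(p, lam, mu):
    """Right-hand side of (1)-(2) on the truncated state space 0..N-1."""
    N = len(p)
    dp = [0.0] * N
    for n in range(N):
        # Rate of leaving state n: arrivals (blocked in the last state,
        # an artefact of truncation) plus departures (impossible at n = 0).
        out = (lam if n < N - 1 else 0.0) + (mu if n > 0 else 0.0)
        dp[n] = -out * p[n]
        if n > 0:
            dp[n] += lam * p[n - 1]
        if n < N - 1:
            dp[n] += mu * p[n + 1]
    return dp


def solve_forward(nu, t_end, lam_fn, mu_fn, n_states=40, dt=0.02):
    """Integrate (1)-(2) from P_{nu,n}(0) = delta_{nu,n} up to t_end."""
    p = [0.0] * n_states
    p[nu] = 1.0
    t = 0.0
    for _ in range(int(round(t_end / dt))):
        k1 = forward_rhs(p, lam_fn(t), mu_fn(t))
        p2 = [x + 0.5 * dt * k for x, k in zip(p, k1)]
        k2 = forward_rhs(p2, lam_fn(t + 0.5 * dt), mu_fn(t + 0.5 * dt))
        p3 = [x + 0.5 * dt * k for x, k in zip(p, k2)]
        k3 = forward_rhs(p3, lam_fn(t + 0.5 * dt), mu_fn(t + 0.5 * dt))
        p4 = [x + dt * k for x, k in zip(p, k3)]
        k4 = forward_rhs(p4, lam_fn(t + dt), mu_fn(t + dt))
        p = [x + dt / 6.0 * (a + 2 * b + 2 * c + d)
             for x, a, b, c, d in zip(p, k1, k2, k3, k4)]
        t += dt
    return p


# Constant parameters with traffic intensity 0.2, starting from an empty line.
p_end = solve_forward(0, 40.0, lambda t: 0.2, lambda t: 1.0)
```

With constant $\lambda = 0.2$ and $\mu = 1$ the computed distribution at $t = 40$ is already very close to the geometric equilibrium distribution with common ratio 0.2.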


Received September 24, 1954. 
1 The research reported here was supported in part by the Office of Naval Research, 
Contract N6 ONR 232-1. 
Our basic problem is the solution of this system of differential equations under
the initial conditions

$$P_{\nu,n}(0) = \delta_{\nu,n} \qquad (n, \nu = 0, 1, 2, \cdots). \tag{3}$$

Reduction to an integral equation. By the results of [5] one is assured of the
existence of a unique, nonnegative, continuously differentiable system of
solutions satisfying $\sum_{n=0}^{\infty} P_{\nu,n} = 1$.


On rescaling the time axis by introducing the new variable

$$\tau = \int_0^t \mu(s)\, ds$$

in place of $t$, it is seen that equations (1) and (2) transform into a new system
in which $\mu$ is identically equal to 1. This rescaling is possible provided $\mu(t)$ has
only discrete zeros. This assumption, while in no way essential to the results of
this paper, will be made in order to simplify the details.

Under these conditions, the system (1) through (3) becomes

$$\frac{d}{d\tau} P_{\nu,n} = -[\rho(\tau) + 1] P_{\nu,n} + \rho(\tau) P_{\nu,n-1} + P_{\nu,n+1} \qquad (n > 0), \tag{4}$$

$$\frac{d}{d\tau} P_{\nu,0} = -\rho(\tau) P_{\nu,0} + P_{\nu,1}, \tag{5}$$

$$P_{\nu,n}(0) = \delta_{\nu,n} \qquad (n, \nu = 0, 1, 2, \cdots), \tag{6}$$

where

$$\rho(\tau) = \frac{\lambda(t)}{\mu(t)};$$

this ratio represents, in the terminology of telephone waiting-line theory, the
instantaneous relative traffic intensity of the process.

By analogy, the quantity

$$R(\tau) = \frac{\int_0^t \lambda(s)\, ds}{\int_0^t \mu(s)\, ds} = \frac{1}{\tau} \int_0^\tau \rho(\sigma)\, d\sigma$$

might be termed the smoothed relative traffic intensity of the process.

The system (4) through (6) can be simplified by introducing a new system of
dependent variables:

$$Q_{\nu,n}(\tau) = e^{[1 + R(\tau)]\tau} P_{\nu,n}(\tau) \qquad (\nu, n = 0, 1, 2, \cdots).$$

In terms of these variables the system (4) through (6) becomes

$$\frac{d}{d\tau} Q_{\nu,n}(\tau) = \rho(\tau) Q_{\nu,n-1}(\tau) + Q_{\nu,n+1}(\tau) \qquad (n > 0), \tag{7}$$

$$\frac{d}{d\tau} Q_{\nu,0}(\tau) = Q_{\nu,0}(\tau) + Q_{\nu,1}(\tau), \tag{8}$$

$$Q_{\nu,n}(0) = \delta_{\nu,n} \qquad (\nu, n = 0, 1, 2, \cdots). \tag{9}$$


In order to reduce this difference-differential equation (7) to a partial
differential equation, we introduce the generating function

$$Q_\nu(z, \tau) = \sum_{n=0}^{\infty} Q_{\nu,n}(\tau) \frac{(z - \tau)^n}{n!} \qquad (\nu = 0, 1, 2, \cdots).$$

This function will be analytic in $z$ and continuously differentiable in $\tau$.
Differentiation with respect to $z$ and $\tau$ gives, using (7), the following hyperbolic
partial differential equation for $Q_\nu(z, \tau)$:

$$\frac{\partial^2 Q_\nu}{\partial \tau\, \partial z} = \rho(\tau) Q_\nu. \tag{10}$$


The solution of such an equation ("the Telegrapher's equation") in general
requires two boundary conditions. These are given by (8) and (9), which
transform into

$$\left. \frac{\partial Q_\nu(z, \tau)}{\partial \tau} \right|_{z=\tau} = Q_\nu(\tau, \tau) \tag{11}$$

and

$$Q_\nu(z, 0) = \frac{z^\nu}{\nu!}. \tag{12}$$

In the method of solution to be used here, we first solve for $Q_\nu(z, \tau)$ in terms
of the (unknown) function

$$f_\nu(\tau) = \frac{\partial Q_\nu(0, \tau)}{\partial \tau}, \tag{13}$$

using the classical Riemann method, with boundary conditions (12) and (13)
(see, for example, [3], p. 316). The condition (11) is then used to derive an integral
equation for $f_\nu(\tau)$.

The Riemann function associated with (10) is easily seen to be

$$I_0\bigl(2\{[R(\tau)\tau - R(\sigma)\sigma](z - \zeta)\}^{1/2}\bigr),$$


where $I_n(u)$ denotes the modified Bessel function $i^{-n} J_n(iu)$. Application of
standard methods and integration formulas for Bessel functions ([11], p. 373) gives
the solution

$$Q_\nu(z, \tau) = A_\nu(0, \tau, z) + \int_0^\tau A_0(\sigma, \tau, z) f_\nu(\sigma)\, d\sigma, \tag{14}$$

where

$$A_n(\sigma, \tau, z) = z^{n/2} [R(\tau)\tau - R(\sigma)\sigma]^{-n/2} I_n\bigl(2\{[R(\tau)\tau - R(\sigma)\sigma] z\}^{1/2}\bigr) \qquad (n = 0, \pm 1, \pm 2, \cdots). \tag{15}$$
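Noting that

$$\frac{\partial}{\partial z} A_n(\sigma, \tau, z) = A_{n-1}(\sigma, \tau, z),$$

$$\frac{\partial}{\partial \tau} A_n(\sigma, \tau, z) = \rho(\tau) A_{n+1}(\sigma, \tau, z), \tag{16}$$

$$A_n(0, 0, 0) = \delta_{0,n},$$

$$A_{-n}(\tau, \tau, \tau) = 0 \text{ for } n > 0, \qquad A_0(\tau, \tau, \tau) = 1,$$

one finds

$$P_{\nu,n}(\tau) = e^{-[1 + R(\tau)]\tau} \left. \frac{\partial^n Q_\nu(z, \tau)}{\partial z^n} \right|_{z=\tau} = e^{-[1 + R(\tau)]\tau} \left[ A_{\nu-n}(0, \tau, \tau) + \int_0^\tau A_{-n}(\sigma, \tau, \tau) f_\nu(\sigma)\, d\sigma \right]. \tag{17}$$

On substituting (14) into (11) using (16), one obtains the following
Volterra-type integral equation for $f_\nu(\tau)$:

$$f_\nu(\tau) = B_\nu(0, \tau) + \int_0^\tau B_0(\sigma, \tau) f_\nu(\sigma)\, d\sigma, \tag{18}$$

where

$$B_n(\sigma, \tau) = A_n(\sigma, \tau, \tau) - \rho(\tau) A_{n+1}(\sigma, \tau, \tau).$$

Consequently, (17) gives the solution to our problem, provided a solution
$f_\nu(\tau)$ to (18) can be found. In the important special case of $\rho(\tau)$ = constant,
(18) can be solved explicitly. In other cases it provides information as to the
limiting behavior of $f_\nu(\tau)$.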


The case of constant traffic intensity. Let us now assume the relative traffic
intensity to be constant, $\rho(\tau) = \rho$. Under these conditions $R(\tau) = \rho$, as well.

Note that the three conditions: $\rho(\tau)$ = constant, $R(\tau)$ = constant, and
$\rho(\tau) = R(\tau)$ are all equivalent.

Several methods are available for obtaining the explicit solution of (18). The
one used here is possibly the simplest if not the most elegant.

Let us now assume that $f_\nu(\tau)$ is representable by a power series

$$f_\nu(\tau) = \sum_{k=0}^{\infty} a_{\nu,k} \tau^k, \tag{19}$$

convergent for all values of $\tau$. This can be proved directly using (7); however,
this is not required. If a solution of (18) can be found in the form of such a power
series, then the uniqueness property for the solutions of such an integral equation
assures us that this series must be $f_\nu(\tau)$.
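Using Sonine's integral formula (see, for example, [11], p. 373), we have the
following formula

$$\int_0^\tau A_{-n}(\sigma, \tau, \tau) [R(\sigma)\sigma]^k \rho(\sigma)\, d\sigma = k!\, A_{-n-k-1}(0, \tau, \tau),$$

for $n \geq 0$, $k \geq 0$. In the case $R(\tau) = \rho(\tau) = \rho$, this becomes

$$\rho^{k+1} \int_0^\tau A_{-n}(\sigma, \tau, \tau)\, \sigma^k\, d\sigma = k!\, A_{-n-k-1}(0, \tau, \tau).$$

Since $I_{-n}(u) = I_n(u)$, $A_{-n}(0, \tau, \tau) = \rho^n A_n(0, \tau, \tau)$, this is also equivalent to

$$\int_0^\tau A_{-n}(\sigma, \tau, \tau)\, \sigma^k\, d\sigma = k!\, \rho^n A_{n+k+1}(0, \tau, \tau). \tag{20}$$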


Consequently, on substituting (19) into (14) and integrating, one finds

$$Q_\nu(z, \tau) = A_\nu(0, \tau, z) + \sum_{k=0}^{\infty} a_{\nu,k}\, k!\, \rho^{-k-1} A_{-k-1}(0, \tau, z).$$


Substituting this into (11) using (16), one obtains the identity

$$A_\nu(0, \tau, \tau) - \rho A_{\nu+1}(0, \tau, \tau) = a_{\nu,0} A_0(0, \tau, \tau) + \sum_{k=1}^{\infty} \bigl(a_{\nu,k}\, k! - a_{\nu,k-1} (k-1)!\bigr) A_k(0, \tau, \tau). \tag{21}$$

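On equating coefficients of $A_k(0, \tau, \tau)$ $(k = 0, 1, 2, \cdots)$, one finds the
following recurrence relations for the $a_{\nu,k}$:

$$a_{\nu,0} = 0 \quad \text{for } \nu > 0,$$

$$a_{\nu,k} = \frac{1}{k} a_{\nu,k-1} \quad \text{for } 0 < k < \nu \text{ or } k > \nu + 1,$$

$$a_{\nu,\nu} = \frac{1}{\nu} a_{\nu,\nu-1} + \frac{1}{\nu!} \quad \text{for } \nu > 0,$$

$$a_{\nu,\nu+1} = \frac{1}{\nu + 1} a_{\nu,\nu} - \frac{\rho}{(\nu + 1)!}.$$

The solution of this system is found to be

$$a_{\nu,k} = 0, \quad k < \nu,$$

$$a_{\nu,\nu} = \frac{1}{\nu!},$$

$$a_{\nu,k} = \frac{1 - \rho}{k!}, \quad k > \nu,$$

whence

$$f_\nu(\tau) = \frac{\tau^\nu}{\nu!} + (1 - \rho) \sum_{k=\nu+1}^{\infty} \frac{\tau^k}{k!}. \tag{22}$$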
In the special case $\nu = 0$,

$$f_0(\tau) = (1 - \rho) e^{\tau} + \rho.$$

Substituting (22) into (17) and integrating term-by-term using (20), one finds
the final solution

$$P_{\nu,n} = e^{-(1+\rho)\tau} \Bigl[ A_{\nu-n}(0, \tau, \tau) + \rho^n A_{n+\nu+1}(0, \tau, \tau) + (1 - \rho) \rho^n \sum_{k=n+\nu+2}^{\infty} A_k(0, \tau, \tau) \Bigr]. \tag{23}$$

Using a table of Bessel functions, $P_{\nu,n}$ can be tabulated for various values of
$\tau$ and $\rho$ from this formula. Some tables are available for the case $\nu = 0$ (see
[2]). For other values of $\nu$, $P_{\nu,n}$ can be found from the formula

$$P_{\nu,n} = \rho^{-\nu} P_{0,n+\nu} + e^{-(1+\rho)\tau} [A_{\nu-n}(0, \tau, \tau) - \rho^n A_{\nu+n}(0, \tau, \tau)],$$

which is easily derived from (23).
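
Formula (23) can also be evaluated without tables. The following Python sketch is an illustration under stated assumptions, not the author's procedure: it uses the power series $A_k(0, \tau, \tau) = \sum_m \tau^{k+2m} \rho^m / (m!\,(m+k)!)$ for $k \geq 0$, which follows from (15) with $R = \rho$ and the series for $I_k$, and it truncates the infinite sum in (23) after a fixed number of terms. The names `A` and `transient_prob` are hypothetical.

```python
import math


def A(k, tau, rho):
    """A_k(0, tau, tau) of (15) for constant rho, via the Bessel series."""
    if k < 0:
        # A_{-n}(0, tau, tau) = rho^n A_n(0, tau, tau)
        return rho ** (-k) * A(-k, tau, rho)
    if tau == 0.0:
        return 1.0 if k == 0 else 0.0
    # m = 0 term tau^k / k!, computed in logs to avoid overflow for large k.
    term = math.exp(k * math.log(tau) - math.lgamma(k + 1))
    total = term
    for m in range(1, 500):
        term *= rho * tau * tau / (m * (m + k))
        total += term
        if term < 1e-17 * total:
            break
    return total


def transient_prob(nu, n, tau, rho, extra_terms=None):
    """P_{nu,n}(tau) from (23), with the infinite sum truncated."""
    if extra_terms is None:
        extra_terms = int(4 * tau) + 60   # assumed ample for moderate tau
    start = n + nu + 2
    tail = sum(A(k, tau, rho) for k in range(start, start + extra_terms))
    bracket = (A(nu - n, tau, rho)
               + rho ** n * A(n + nu + 1, tau, rho)
               + (1.0 - rho) * rho ** n * tail)
    return math.exp(-(1.0 + rho) * tau) * bracket


# Distribution at tau = 5 for rho = 0.5, starting with nu = 2 in line.
dist = [transient_prob(2, n, 5.0, 0.5) for n in range(80)]
```

Useful checks on such an implementation are that the probabilities sum to one, that $P_{\nu,n}(0) = \delta_{\nu,n}$, that $P_{0,0}(\tau)$ approaches $1 - \rho$ for large $\tau$ when $\rho < 1$, and that the relation between $P_{\nu,n}$ and $P_{0,n+\nu}$ stated above holds.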

Formulas essentially equivalent to (23) have been derived by Ledermann and
Reuter [9] and by Bailey [1] for the case of constant $\lambda$ and $\mu$, using somewhat
different methods.


Limiting formulas for mean and variance. All the well-known limiting results
for the case $\rho$ = constant can be derived directly from (23). When $\rho < 1$, the
probability distribution of $n(t)$ approaches a geometric equilibrium distribution
with common ratio $\rho$ as $\tau \to \infty$, independent of $\nu$. When $\rho \geq 1$, no such
limiting distribution exists. (See [7] and [8] for precise statements of the results in this
and in more general cases.)

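Let us temporarily drop the restriction that $\rho$ be constant, and proceed to
develop formulas for the mean $M_\nu(\tau)$ and the standard deviation $\sigma_\nu(\tau)$ of the
distribution. By definition,

$$M_\nu(\tau) = \sum_{n=0}^{\infty} n P_{\nu,n}, \qquad \sigma_\nu^2(\tau) = \sum_{n=0}^{\infty} [n - M_\nu(\tau)]^2 P_{\nu,n}.$$

Assuming term-by-term differentiation to be justified (which can easily be
proved), these series may be differentiated using (4), together with the fact that
$\sum_{n=0}^{\infty} P_{\nu,n} = 1$, to give

$$\frac{d}{d\tau} M_\nu(\tau) = \rho(\tau) - 1 + P_{\nu,0},$$

$$\frac{d}{d\tau} \sigma_\nu^2(\tau) = 2\bigl[\rho(\tau) + (\rho(\tau) - 1) M_\nu(\tau)\bigr] - [1 + 2 M_\nu(\tau)] \frac{d M_\nu(\tau)}{d\tau},$$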
whence

$$M_\nu(\tau) = [R(\tau) - 1]\tau + \int_0^\tau P_{\nu,0}(\sigma)\, d\sigma + \nu, \tag{24}$$

$$\sigma_\nu^2(\tau) = 2 \int_0^\tau \bigl[\rho(\sigma) + (\rho(\sigma) - 1) M_\nu(\sigma)\bigr]\, d\sigma - M_\nu(\tau) - [M_\nu(\tau)]^2 + \nu + \nu^2. \tag{25}$$

If $\rho(\tau)$ is bounded for all $\tau \geq 0$, then one sees from (24) and (25) that both
$M_\nu(\tau)$ and $\sigma_\nu(\tau) = O(\tau)$ as $\tau \to \infty$.

In the case of constant $\rho \geq 1$, (24) and (25) will be used to determine more
explicitly the limiting behavior of $M_\nu(\tau)$ and $\sigma_\nu(\tau)$ as $\tau \to \infty$. (It is here assumed
that $\tau = \int_0^t \mu(s)\, ds \to \infty$ as $t \to \infty$.)

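Let us first assume that $\rho$ = constant and that $\rho > 1$. Using standard
integration formulas for Bessel functions ([11], p. 386), one finds that

$$\int_0^\infty e^{-(1+\rho)\tau} A_n(0, \tau, \tau)\, d\tau = \frac{1}{\rho^n (\rho - 1)}.$$

When this formula is used to integrate (23) term-by-term, one obtains

$$\int_0^\infty P_{\nu,0}(\sigma)\, d\sigma = \frac{1}{\rho^\nu (\rho - 1)} + \frac{1}{\rho^{\nu+1} (\rho - 1)} + (1 - \rho) \sum_{k=\nu+2}^{\infty} \frac{1}{\rho^k (\rho - 1)}.$$

Consequently,

$$\int_0^\tau P_{\nu,0}(\sigma)\, d\sigma = \frac{1}{\rho^\nu (\rho - 1)} + o(1),$$

and, substituting this result into (24),

$$M_\nu(\tau) = (\rho - 1)\tau + \frac{1}{\rho^\nu (\rho - 1)} + \nu + o(1). \tag{26}$$

If (26) is substituted into (25), then it is easily shown that

$$\sigma_\nu(\tau) = \sqrt{\tau(\rho + 1)} + o(\sqrt{\tau}). \tag{27}$$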
Let us now assume $\rho = 1$. In this case, from (23),

$$P_{\nu,0}(\tau) = e^{-2\tau} [I_\nu(2\tau) + I_{\nu+1}(2\tau)] = \frac{1}{\sqrt{\pi\tau}} \left[ 1 + O\!\left(\frac{1}{\tau}\right) \right],$$

using an asymptotic formula for $I_\nu(z)$. Consequently

$$M_\nu(\tau) = \int_0^\tau P_{\nu,0}(\sigma)\, d\sigma + \nu = 2\sqrt{\tau/\pi} + O(1), \tag{28}$$

and

$$\sigma_\nu(\tau) = \sqrt{2\tau(1 - 2/\pi)} + O(1). \tag{29}$$

Note that neither (26) nor (27) reduces to (28) or (29) when $\rho$ is set equal to 1.


Waiting times. If $T(\tau)$ is a random variable representing the time required
to complete the servicing of an individual arriving at time $\tau$, then

$$T(\tau) = S[n(\tau)],$$

where $S[n]$ is a random variable independent of $n(\tau)$ and represents the time
required for $n + 1$ transitions in a Poisson process having parameter 1; i.e., $S[n]$
is the sum of $n + 1$ independent random variables each having the probability
density function $e^{-\tau}$, $0 < \tau < \infty$, and will thus have a Gamma distribution.
Khintchine [6] and Volberg [10] have derived asymptotic formulas for the
distribution of $T(\tau)$ as $\tau \to \infty$ for the case of constant $\rho$. Using the above formula,
one sees that the probability density function for $T(\tau)$ is

$$\psi(s; \tau) = e^{-s} \sum_{n=0}^{\infty} P_{\nu,n}(\tau) \frac{s^n}{n!} = e^{-[1 + R(\tau)]\tau - s}\, Q_\nu(s + \tau, \tau),$$

for $s > 0$. By using this result, these asymptotic formulas may be derived from
the results of this paper.


Note. I am indebted to the referee for the references to the papers of Volberg 
[10] and Ledermann and Reuter [9]. 
ACCURATE SEQUENTIAL TESTS ON THE MEAN OF AN
EXPONENTIAL DISTRIBUTION 


By G. E. ALBERT 
University of Tennessee 


0. Summary. In this paper, methods introduced earlier by the author [1] are
used to obtain simple, accurate formulas for the decision boundaries for
sequential probability ratio tests for simple hypotheses and alternatives on the mean
$\theta$ of the exponential distribution $\theta^{-1} \exp(-u/\theta)$. Examples are provided to
indicate the accuracy and the degree of complexity of the results. It is hoped that
the results given here will find applications in life testing and statistical studies
of radioactive decay.


1. Some integral equations. Consider a sequential probability ratio test for a
simple hypothesis $\theta_1$ and alternative $\theta_2 > \theta_1$ on the mean of the exponential
distribution

$$g(u; \theta) = \theta^{-1} \exp(-u/\theta), \qquad u > 0. \tag{1}$$

The substitutions $\xi = -\theta^{-1}$, $Q(\xi) = \log(-\xi)$, $v = u$ and $f(v; \xi) = (-\xi)\exp(\xi v)$
identify the present problem with a more general one studied in Sections 4 and
6 of [1]. This identification will not be used here because it introduces needless
complication of notation in the simple problem at hand. No confusion should
arise from similarities or differences between the notations used in [1] and those
used here.

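Define the parameters

\[ r = \theta_2/\theta_1, \qquad h = \log r, \qquad m = \theta(\theta_2 - \theta_1)/(\theta_1\theta_2). \]

The logarithm of the probability ratio for the test takes the form

\[ z = \log \frac{g(u; \theta_2)}{g(u; \theta_1)} = \frac{mu}{\theta} - h \]

and its p.d.f. is

\[ f(z; m) = \begin{cases} m^{-1} \exp[-(z + h)/m], & z \ge -h, \\ 0, & z < -h. \end{cases} \tag{2} \]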


As in [1], Part II, let $-b$ and $a$ be the decision boundaries on the cumulative
sums $x_N = \sum_{i=0}^{N} z_i$ of $z$ and let the starting point $x_0 = z_0$ of the test be chosen
arbitrarily in the open interval $(-b, a)$. When $\theta$ is the true value of the mean of


Received March 14, 1955.

(1), the probability $P_1(x_0; \theta)$ of deciding in favor of the hypothesis $\theta_1$ and the
expected duration $M_1(x_0; \theta)$ of the test satisfy the integral equations

\[ P_1(x; \theta) = \int_{-\infty}^{-b} f(y - x; m)\, dy + \int_{-b}^{a} P_1(y; \theta) f(y - x; m)\, dy, \tag{3} \]

\[ M_1(x; \theta) = 1 + \int_{-b}^{a} M_1(y; \theta) f(y - x; m)\, dy \tag{4} \]

on the interval $(-b, a)$. The kernel $f(y - x; m)$ is obtained from (2). The
probability of deciding in favor of the alternative hypothesis $\theta_2$ is given by
$P_2(x; \theta) = 1 - P_1(x; \theta)$.

The integral equations (3) and (4) can be solved exactly by a simple device 
that will be indicated in Section 3, below. The results are too unwieldy for 
practical use in the determination of decision boundaries for preassigned error 
risks. Approximate solutions for the integral equations will also be obtained. 
These are relatively easy to use and are demonstrated to be of sufficient accuracy 
to be considered essentially exact for practical purposes. 

It will be convenient to transform the integral equations slightly by introduc- 
ing the quantities 


(5) H =h/m, A = a/m, B = b/m, s = x/m, t = y/m. 


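Account being taken of the discontinuity of the kernel, the equations (3) and
(4) take the forms

\[ P_1(ms; \theta) = \begin{cases} 1 - e^{s+B-H} + \int_{-B}^{A} P_1(mt; \theta) e^{-t+s-H}\, dt, & -B < s \le -B + H, \\ \int_{s-H}^{A} P_1(mt; \theta) e^{-t+s-H}\, dt, & -B + H \le s < A, \end{cases} \tag{6} \]

and

\[ M_1(ms; \theta) = \begin{cases} 1 + \int_{-B}^{A} M_1(mt; \theta) e^{-t+s-H}\, dt, & -B < s \le -B + H, \\ 1 + \int_{s-H}^{A} M_1(mt; \theta) e^{-t+s-H}\, dt, & -B + H \le s < A. \end{cases} \tag{7} \]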


2. Approximate solutions of (6). Let $n$ be any positive integer or zero with
the restriction that $-B + nH \le A$ and let $\lambda = \lambda(\theta)$ be the non-zero solution
of the equation

\[ 1 + \lambda = e^{\lambda H}. \]

Define the functions $G_n(s)$ and $\psi_n(s)$ by

\[ G_n(s) = \sum_{j=0}^{n} \frac{(-1)^j}{j!} e^{-jH} (s + B - jH)^j, \qquad \psi_n(s) = \frac{e^{-\lambda(s-H)} - e^{-\lambda A}}{\delta_n - e^{-\lambda A}}, \]

where $\delta_n$ is a constant to be determined. Let $\phi_n(s; \theta)$ denote a function defined
on the interval $(-B, A)$ by

\[ \phi_n(s; \theta) = \begin{cases} 1 - C_n e^{s+B-H} G_{k-1}(s), & -B + (k-1)H \le s \le -B + kH, \quad k = 1, \dots, n, \\ \psi_n(s), & -B + nH \le s < A, \end{cases} \tag{8} \]

where $C_n$ is a constant to be determined. When $n = 0$, the first form on the right
in (8) is to be deleted. Let $\Phi_n(s; \theta)$ denote the iterate of the function (8) under
the operator on the right in equation (6).

It will be shown that if $C_n$ and $\delta_n$ are any pair of constants related by the
equation

\[ C_n e^{nH} G_n\{-B + (n+1)H\} = 1 - \psi_n\{-B + (n+1)H\}, \tag{9} \]

then $\phi_n(s; \theta)$ and its iterate $\Phi_n(s; \theta)$ are identical on the subintervals
$-B < s \le -B + nH$ and $-B + (n+1)H \le s < A$ of $(-B, A)$. The identity does
not persist to the subinterval $-B + nH \le s \le -B + (n+1)H$, but it will
be shown that pairs of values $\delta_n(\theta, U)$, $C_n(\theta, U)$ and $\delta_n(\theta, L)$, $C_n(\theta, L)$ exist for
$\delta_n$ and $C_n$ such that (9) is satisfied and the resulting functions $\phi_n(s; \theta, U)$ and
$\phi_n(s; \theta, L)$ and their iterates $\Phi_n(s; \theta, U)$ and $\Phi_n(s; \theta, L)$ have the properties

\[ \Phi_n(s; \theta, U) \le \phi_n(s; \theta, U), \qquad \Phi_n(s; \theta, L) \ge \phi_n(s; \theta, L), \qquad -B + nH \le s \le -B + (n+1)H. \tag{10} \]

It follows from Theorem 4 of [1] that $\phi_n(s; \theta, U)$ and $\phi_n(s; \theta, L)$ are respectively
upper and lower bounds for the function $P_1(ms; \theta)$ over the entire interval $(-B, A)$.
Specifically, the stated values of $\delta_n$ are defined by the following: Let

\[ Q_n(z; \theta) = \frac{G_n\{-B + (n+1)H\}\, e^{(1+\lambda)z} - G_n\{-B + (n+1)H - z\}}{G_n\{-B + (n+1)H\}\, e^{z} - G_n\{-B + (n+1)H - z\}}, \qquad 0 \le z \le H, \tag{11} \]

then

\[ \delta_n(\theta, U) e^{-\lambda(B - nH)} = \min Q_n(z; \theta), \quad 0 \le z \le H, \qquad \delta_n(\theta, L) e^{-\lambda(B - nH)} = \max Q_n(z; \theta), \quad 0 \le z \le H. \tag{12} \]

The values $C_n(\theta, U)$ and $C_n(\theta, L)$ are then defined by using (12) in (9).
In general, the extrema required in (12) must be determined numerically.
This is inconvenient. The value of $\delta_n$ specified by

\[ \delta_n = \delta_n(\theta, C) = Q_n(H; \theta)\, e^{\lambda(B - nH)} \tag{13} \]

is relatively easy to calculate. It and its companion value $C_n(\theta; C)$, defined by
(9) and (13), define an approximate solution $\phi_n(s; \theta, C)$ for the equation (6)
that is continuous on the whole interval $(-B, A)$ and which lies between the
corresponding upper and lower bounds, at least on the important subinterval
$-B + nH \le s < A$.

The proofs of the facts stated above are a matter of laborious detail that will 
be sketched next. The uninterested reader should proceed at once to the next 
section. The proofs of (10) will be divided into cases. 

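Case (i): $-B < s \le -B + H$. In this case the iterate of $\phi_n(s; \theta)$ is given by

\[ \Phi_n(s; \theta) = 1 - e^{s+B-H} + \sum_{k=1}^{n} \int_{-B+(k-1)H}^{-B+kH} e^{-t+s-H} \{1 - C_n e^{t+B-H} G_{k-1}(t)\}\, dt + \int_{-B+nH}^{A} e^{-t+s-H} \psi_n(t)\, dt \]

\[ = 1 - e^{s+B-H} \Big[ 1 - \sum_{k=1}^{n} \{e^{-(k-1)H} - e^{-kH}\} \Big] + C_n e^{s+B-H} \sum_{k=1}^{n} \{G_k[-B + (k+1)H] - G_{k-1}[-B + kH]\} \]
\[ \qquad + \frac{e^{s-H}}{\delta_n - e^{-\lambda A}} \Big\{ \frac{e^{\lambda H}}{1 + \lambda} \big[ e^{(1+\lambda)(B - nH)} - e^{-(1+\lambda)A} \big] - e^{-\lambda A} \big[ e^{B - nH} - e^{-A} \big] \Big\}. \]

This can be simplified by the use of $1 + \lambda = e^{\lambda H}$, the identities

\[ 1 - \sum_{k=1}^{n} \{e^{-(k-1)H} - e^{-kH}\} = e^{-nH}, \]

\[ \sum_{k=1}^{n} \{G_k[-B + (k+1)H] - G_{k-1}[-B + kH]\} = G_n[-B + (n+1)H] - 1, \]

and the relation (9) to the form $\Phi_n(s; \theta) = 1 - C_n e^{s+B-H} = \phi_n(s; \theta)$.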

Case (ii): $-B + (M-1)H \le s \le -B + MH$, $1 < M \le n$. In this case
the iterate is given by

\[ \Phi_n(s; \theta) = \int_{s-H}^{-B+(M-1)H} e^{-t+s-H} \{1 - C_n e^{t+B-H} G_{M-2}(t)\}\, dt + \sum_{k=M}^{n} \int_{-B+(k-1)H}^{-B+kH} e^{-t+s-H} \{1 - C_n e^{t+B-H} G_{k-1}(t)\}\, dt + \int_{-B+nH}^{A} e^{-t+s-H} \psi_n(t)\, dt. \]

Devices similar to those used in Case (i) reduce this to the form $\phi_n(s; \theta) =
1 - C_n e^{s+B-H} G_{M-1}(s)$ appropriate to this interval.

Case (iii): $-B + (n+1)H \le s < A$. In this case

\[ \Phi_n(s; \theta) = \int_{s-H}^{A} \psi_n(t)\, e^{-t+s-H}\, dt = \psi_n(s) \tag{14} \]

is almost immediate. The only reduction needed follows from $1 + \lambda = \exp(\lambda H)$.







Case (iv): $-B + nH < s \le -B + (n+1)H$. In this case the iterate takes
the form

\[ \Phi_n(s; \theta) = \int_{s-H}^{-B+nH} e^{-t+s-H} \{1 - C_n e^{t+B-H} G_{n-1}(t)\}\, dt + \int_{-B+nH}^{A} e^{-t+s-H} \psi_n(t)\, dt. \]

Add and subtract the integral over $(s - H, -B + nH)$ of the quantity $\psi_n(t)
\exp(-t + s - H)$ and use the formal identity (14) to reduce this to the form

\[ \Phi_n(s; \theta) = \psi_n(s) + \epsilon_n(s; \theta), \]

\[ \epsilon_n(s; \theta) = \{1 - \psi_n(s)\} - \{1 - \psi_n[-B + (n+1)H]\}\, e^{s+B-(n+1)H} + C_n e^{s+B-H} \{G_n[-B + (n+1)H] - G_n(s)\}. \]

The relation (9) may be used to eliminate the constant $C_n$ from $\epsilon_n(s; \theta)$ to obtain

\[ \epsilon_n(s; \theta) = \{1 - \psi_n(s)\} - \{1 - \psi_n[-B + (n+1)H]\}\, e^{s+B-(n+1)H}\, \frac{G_n(s)}{G_n[-B + (n+1)H]}, \qquad -B + nH \le s \le -B + (n+1)H. \tag{15} \]

The inequalities (10) result from the requirements $\epsilon_n(s; \theta) \le 0$ and $\epsilon_n(s; \theta) \ge 0$,
respectively, enforced over the interval of definition of $\epsilon_n(s; \theta)$. The definitions
(12) are easily derived by setting $s = -B + (n+1)H - z$ and assuming that
$\delta_n \ge \exp(-\lambda A)$. This last assumption is always justifiable in practical cases.


3. Exact solutions of (6). Exact solutions for the integral equation (6) may
be found by a modification of the above-described technique. Omit the final
form $\psi_n(s)$ in the definition of $\phi_n(s; \theta)$ and choose $n$ large enough that

\[ -B + nH \ge A > -B + (n-1)H. \]

For example, if, for some integer $L$, $A + B = LH$, choose $n = L$. A relation
comparable to (9) is found:

\[ C_n e^{nH} G_n[-B + (n+1)H] = 1, \qquad A + B = LH. \]

This determines $C_n$. If $A + B = (L + \nu)H$ where $L$ is an integer and $0 < \nu < 1$,
a more complicated relation is found for the determination of $C_n$.

These exact results are almost useless for practical determinations of decision
boundaries to effect desired risk probabilities. The formulas are so nearly
indeterminate that the writer obtained absurd results from them using modest
computing facilities. In comparison, it will be shown in later sections that quite
accurate determinations of decision boundaries may be made easily by use of the
approximations $\phi_n(s; \theta, C)$.

The method of derivation of the function (8) may be of interest. From
well-known theory, the integral equation (6) has a unique, continuous solution on
$-B < s < A$. From the first form of the equation it is obvious that on the
subinterval $-B < s \le -B + H$, $P_1(ms; \theta) = 1 - C \exp(s + B - H)$ for
some choice of $C$. Differentiation of the second form of the integral equation leads
to the differential-difference equation

\[ \frac{d}{ds} P_1(ms; \theta) - P_1(ms; \theta) = -P_1(m(s - H); \theta), \]

valid over $-B + H \le s < A$. The change of form of $P_1(ms; \theta)$ from any
interval $-B + (k-1)H \le s \le -B + kH$ to the next is easily determined from
this equation and the continuity requirement. Finally, the form (8) is simply
an expedient combination of the form for the exact solution and the function
$\psi_n(s)$, which was studied in [1].
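
The passage from one subinterval to the next can be written out as a short worked step. This is only a sketch based on the definitions above; the auxiliary function $u(s)$ is introduced here for illustration and does not appear in the original. It assumes the reconstruction $G_n(s) = \sum_{j=0}^{n} \frac{(-1)^j}{j!} e^{-jH}(s + B - jH)^j$.

```latex
% Trial form on a general subinterval, with u(s) = 1 on -B < s <= -B + H:
P_1(ms;\theta) = 1 - C\, e^{s+B-H}\, u(s).
% Left and right sides of the differential-difference equation:
\frac{d}{ds}P_1(ms;\theta) - P_1(ms;\theta) = -1 - C\, e^{s+B-H}\, u'(s),
\qquad
-P_1(m(s-H);\theta) = -1 + C\, e^{s+B-2H}\, u(s-H).
% Equating the two gives the recursion for u:
u'(s) = -e^{-H}\, u(s-H).
% The polynomials G_n obey the same recursion and join continuously:
G_n'(s) = -e^{-H}\, G_{n-1}(s-H),
\qquad
G_n(-B+nH) = G_{n-1}(-B+nH).
```

The last line holds because the $j = n$ term of $G_n$ vanishes at $s + B = nH$, which is the continuity requirement mentioned in the text.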

For sequential tests on the mean occurrence time of a Poisson process with a 
continuous time parameter, Dvoretzky, Kiefer, and Wolfowitz [3] found a 
differential-difference equation similar to that above and exact formulas for the 
operating characteristics of the tests that were of the same structure as the exact 
solution indicated above for the integral equation (6). Their discussion can be 
interpreted in a manner to apply to sequential tests on the mean of an exponen- 
tial population. Anscombe and Page [2] show how this is done and indicate 
another derivation of the quoted results of Dvoretzky, Kiefer, and Wolfowitz. 
These papers do not consider the problem of obtaining useful approximate 
results. 


4. Remarks on the approximation of $P_1(0; \theta_2)$. As in Wald [6], it is usual to
start a sequential probability ratio test of $\theta_1$ versus $\theta_2$ at $x_0 = z_0 = 0$. The
design problem consists in the determination of the boundaries $a$ and $-b$ to achieve
preassigned probabilities

\[ \alpha = P_2(0; \theta_1), \qquad \beta = P_1(0; \theta_2), \]

of the first and second kinds of error. Easy success in this problem will depend
on two things: (i) a choice of $n$ in (8) small enough that the starting point $s_0 = 0$
lies in the subinterval $-B + nH \le s < A$ of validity of the simple form $\psi_n(s)$
of $\phi_n(s; \theta)$, and (ii) a choice of $n$ large enough that the bounds $\phi_n(s_0; \theta, U)$ and
$\phi_n(s_0; \theta, L)$ are close enough together to give the accuracy desired in the test.
Explicit calculation of the values of the constants defined in (12) must be done
numerically if $n > 0$. Sample calculations performed by the writer indicate that
for a given set of values of $a$, $b$ and $s_0$, the difference $\phi_n(s_0; \theta, U) - \phi_n(s_0; \theta, L)$
decreases with increasing $n$ or with decreasing $r$. The computational difficulty
in obtaining needed values of $\delta_n$ and $C_n$ increases rapidly with $n$. It appears then
that a good rule is to use the smallest value of $n$ which will provide the accuracy
desired. As a rough guide, the writer’s experience has been that the choice $n = 2$
will usually give bounds for $P_1(0; \theta_2)$ that differ by less than one per cent; for
the choice $n = 3$, the bounds usually differ by something less than one tenth of
one per cent. Both of these estimates of accuracy are based on values of $r$
between 1.0 and 2. The choices $n = 0, 1$ are quite poor unless $r$ is near unity.
After a value for $n$ has been chosen, either by the rough suggestions given
above or by actual computation of the series of bounds for $P_1(0; \theta_2)$, either the
upper or the lower bound or any value between them may be used as an
approximate formula. Since the extrema required in (12) must be calculated numerically,
the continuous approximation given by use of (13) seems simpler to use. With $n$
chosen so that the simple form $\psi_n(s)$ of $\phi_n(s; \theta_2)$ is to be used at $s = 0$, it is clear
that the value given by the continuous approximation will lie between the
corresponding upper and lower bounds.

The approximation $\phi_2(s; \theta, C)$ is simple to calculate and is quite accurate.
It is suggested here as a practical compromise approximation to be used in most
designs. A brief table of data to facilitate the study of $\phi_2(s; \theta_2, C)$ and the
corresponding bounds is given in Table 1.


TABLE 1
Data for the computation of $P_1(0; \theta_2)$

$m = m_2 = r - 1$, $H = H_2 = (\log r)/(r - 1)$

   r       H_2       e^{-H_2}    Q_2(H_2; θ_2)/r^2    log [Q_2(H_2; θ_2)/r^2]

 1.01    .995039    .369709         1.01343               .013341
 1.05    .975804    .376889         1.06758               .065393
 1.10    .953102    .385543         1.13626               .127741
 1.15    .931747    .393865         1.20602               .187324
 1.20    .911605    .401879         1.27682               .244376
 1.25    .89257     .409600         1.34863               .299090
 1.30    .874543    .417053         1.42142               .351658
 1.40    .841180    .431202         1.56985               .450982
 1.50    .810930    .444444         1.72191               .543433
 1.75    .746155    .474187         2.11714               .750023
 2.00    .693147    .500000         2.53198               .929003


5. Approximation of $P_2(0; \theta_1)$. Approximate formulas and bounds for $P_2(x; \theta_1)$
are to be found from the identity $P_2(x; \theta_1) = 1 - P_1(x; \theta_1)$ and the results given
in Section 2 for $P_1$. Clearly, $P_2(0; \theta_1) \approx 1 - \phi_n(0; \theta_1, C)$ and $1 - \phi_n(0; \theta_1, U) \le
P_2(0; \theta_1) \le 1 - \phi_n(0; \theta_1, L)$.

For the case $\theta = \theta_1$, one finds that $m = m_1 = 1 - (1/r)$, $\lambda = \lambda_1 = (1/r) - 1$,
and $H = H_1 = rH_2$. It is easy to show that $Q_n(rz; \theta_1) = 1/Q_n(z; \theta_2)$. From this
it follows that

\[ \delta_n(\theta_1, U)\, e^{-\lambda_1(B_1 - nH_1)} = 1 \big/ \big[ \delta_n(\theta_2, U)\, e^{-\lambda_2(B_2 - nH_2)} \big], \]

\[ \delta_n(\theta_1, L)\, e^{-\lambda_1(B_1 - nH_1)} = 1 \big/ \big[ \delta_n(\theta_2, L)\, e^{-\lambda_2(B_2 - nH_2)} \big]. \]

These relations give the values of $\delta_n$ and $C_n$ needed for bounds on $P_2(x; \theta_1)$ in
terms of those used for bounds on $P_1(x; \theta_2)$. Clearly, the same reciprocal
relationship may be used to obtain $\delta_n(\theta_1; C)$ and the corresponding value $C_n(\theta_1; C)$
for a continuous approximation to $P_2(x; \theta_1)$.


6. Approximate decision boundaries. Page [5] shows a simple method for an
improvement of Wald’s approximate formulas for setting the decision boundaries
and estimating the expected sample size of a sequential test. He illustrates his
method for a normal population. Epstein and Sobel [4] give improvements over
Wald’s formulas for the specific case of a semicontinuous sequential decision
procedure on the mean of an exponential population. The latter authors study
also the accuracy of their results by means of upper and lower bounds on the
operating characteristic and expected sample size for their specific test setup.
This line of attack will be continued in the sections that follow here by using the
continuous approximations for $P_1(0; \theta_2)$ and $P_2(0; \theta_1)$ obtained above to derive
a series of simple formulas of increasing accuracy for setting the decision
boundaries $a$ and $-b$ to achieve preassigned risk probabilities $\alpha$ and $\beta$. The
corresponding upper and lower bounds for $P_1(0; \theta_2)$ and $P_2(0; \theta_1)$ will be used to establish
the accuracy of the formulas for $a$ and $-b$.

Assume that, for a chosen $n$, the boundaries will be such that $-b + nh <
0 < a$. One then has the equations

\[ \beta = \frac{e^{\lambda_2 H_2} - e^{-\lambda_2 A_2}}{\delta_n(\theta_2; C) - e^{-\lambda_2 A_2}}, \qquad 1 - \alpha = \frac{e^{\lambda_1 H_1} - e^{-\lambda_1 A_1}}{\delta_n(\theta_1; C) - e^{-\lambda_1 A_1}} \tag{16} \]

to be solved for $a$ and $b$. By (13) and the remarks in Section 5,

\[ \delta_n(\theta_2; C) = e^{\lambda_2(B_2 - nH_2)}\, Q_n(H_2; \theta_2) = \frac{e^{b}\, Q_n(H_2; \theta_2)}{r^n}, \]

\[ \delta_n(\theta_1; C) = \frac{e^{\lambda_1(B_1 - nH_1)}}{Q_n(H_2; \theta_2)} = \frac{r^n e^{-b}}{Q_n(H_2; \theta_2)}, \]

where

\[ Q_n(H_2; \theta_2) = \frac{G_n\{-B_2 + (n+1)H_2\}\, e^{(1+\lambda_2)H_2} - G_n\{-B_2 + nH_2\}}{G_n\{-B_2 + (n+1)H_2\}\, e^{H_2} - G_n\{-B_2 + nH_2\}} \]

and

\[ e^{\lambda_i H_i} = 1 + \lambda_i, \qquad A_i\lambda_i = (-1)^i a, \qquad B_i\lambda_i = (-1)^i b, \qquad i = 1, 2. \]

The solution of (16) for $a$ and $-b$ is readily found to be

\[ a = \log\frac{1 - \beta}{\alpha} - \log r, \qquad -b = \log\frac{\beta}{1 - \alpha} + \log Q_n(H_2; \theta_2) - (n+1)\log r. \tag{17} \]

The assumptions made in deriving (17) are easily checked in any special case.
Wald’s results [6] were the first terms on the right in (17). It is remarkable that
so simple a modification of his results should suffice for the accuracy that will
be indicated in Section 8.
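
The boundary formulas (17) are easy to evaluate numerically. The following is a minimal sketch, not the author's own computing procedure: the function names `G`, `Q` and `boundaries` are hypothetical, and the code assumes the forms $G_n(s) = \sum_{j=0}^{n} \frac{(-1)^j}{j!} e^{-jH}(s+B-jH)^j$ and (11), together with $1 + \lambda_2 = r$ and $H_2 = (\log r)/(r-1)$. Since $G_n$ depends on $s$ only through $s + B$, the code works with $w = s + B$.

```python
import math

def G(n, w, H):
    """G_n at s = -B + w, written as a function of w = s + B (B cancels)."""
    return sum((-1) ** j / math.factorial(j) * math.exp(-j * H) * (w - j * H) ** j
               for j in range(n + 1))

def Q(n, z, r):
    """Q_n(z; theta_2) as in (11), with 1 + lambda_2 = r; requires 0 < z <= H_2."""
    H = math.log(r) / (r - 1.0)
    top = G(n, (n + 1) * H, H)          # G_n{-B + (n+1)H}
    shifted = G(n, (n + 1) * H - z, H)  # G_n{-B + (n+1)H - z}
    return (top * math.exp(r * z) - shifted) / (top * math.exp(z) - shifted)

def boundaries(alpha, beta, r, n=2):
    """Boundaries (a, b) from (17); the test stops at a or at -b."""
    H = math.log(r) / (r - 1.0)
    q = Q(n, H, r)
    a = math.log((1 - beta) / alpha) - math.log(r)
    b = math.log((1 - alpha) / beta) - math.log(q) + (n + 1) * math.log(r)
    return a, b, q

a, b, q = boundaries(0.05, 0.05, 1.5, n=2)
print(round(a, 4), round(b, 4), round(q, 3))
```

For $\alpha = \beta = 0.05$, $r = 1.5$ and $n = 2$ this gives values of $a$ and $b$ close to 2.539 and 2.806, in line with Example 1 of Section 8. Note that `Q` is undefined at $z = 0$, where numerator and denominator both vanish.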

It might be noted that the results (16) and (17) could have been stated in terms
of a general starting point $x_0$ for the test if $-b + nh < x_0 < a$. The effect would
have been to replace $a$ and $-b$ by $a - x_0$ and $-b - x_0$, respectively.

A judgment of the accuracy of (17) in any given example is easily obtained by
computing bounds upon the true values of $P_1(0; \theta_2)$ and $P_2(0; \theta_1)$ generated by
the use of (17). One finds

\[ \frac{r^{n+1} e^{a} - r^{n}}{e^{a+b} K_1 - r^{n}} \le P_1(0; \theta_2) \le \frac{r^{n+1} e^{a} - r^{n}}{e^{a+b} K_2 - r^{n}}, \qquad \frac{e^{b} K_2 - r^{n+1}}{r e^{a+b} K_2 - r^{n+1}} \le P_2(0; \theta_1) \le \frac{e^{b} K_1 - r^{n+1}}{r e^{a+b} K_1 - r^{n+1}}, \tag{18} \]

where

\[ K_1 = \max_{(0, H_2)} Q_n(z; \theta_2), \qquad K_2 = \min_{(0, H_2)} Q_n(z; \theta_2). \]

These bounds are to be evaluated by use of the results

\[ e^{a+b} = \frac{(1 - \alpha)(1 - \beta)\, r^{n}}{\alpha\beta\, Q_n(H_2; \theta_2)} \tag{19} \]

of (17) and a graph of $Q_n(z; \theta_2)$.


7. The ASN. It is convenient to transform the integral equation (7) as follows.
The formal identity

\[ 1 + A - s = 1 - H + \int_{s-H}^{A} (1 + A - t)\, e^{-t+s-H}\, dt \]

may be written in the form

\[ 1 + A - s = \begin{cases} 1 - H + \int_{s-H}^{-B} (1 + A - t)\, e^{-t+s-H}\, dt + \int_{-B}^{A} (1 + A - t)\, e^{-t+s-H}\, dt, & -B < s \le -B + H, \\ 1 - H + \int_{s-H}^{A} (1 + A - t)\, e^{-t+s-H}\, dt, & -B + H \le s < A. \end{cases} \]

From this and the equations (6) and (7), it is seen that the function

\[ R(ms; \theta) = (H - 1) M_1(ms; \theta) + 1 + A - s - (A + B) P_1(ms; \theta) \tag{20} \]

satisfies the integral equation

\[ R(ms; \theta) = \begin{cases} H - B - s + \int_{-B}^{A} R(mt; \theta)\, e^{-t+s-H}\, dt, & -B < s \le -B + H, \\ \int_{s-H}^{A} R(mt; \theta)\, e^{-t+s-H}\, dt, & -B + H \le s < A. \end{cases} \tag{21} \]

Equation (21) for $R$ is quite similar to equation (6) for $P_1$ and may be treated
in much the same way. This will be indicated below. First, easily obtained bounds
for $R$ will be given. They are probably good enough for most practical purposes.
It is trivial to show that

\[ 1 - e^{s+B-H} \le H - B - s \le \frac{H(1 - e^{s+B-H})}{1 - e^{-H}}, \qquad -B \le s \le -B + H. \]

It follows at once from Part (ii) of Theorem 5 in [1] that

\[ P_1(ms; \theta) \le R(ms; \theta) \le \frac{H}{1 - e^{-H}}\, P_1(ms; \theta), \qquad -B < s < A. \]

Bounds on the expected sample size are then readily obtained by use of (20).

To obtain more accurate results, proceed in the manner of Sections 2 and 3.
It is evident that on the subinterval $-B < s \le -B + H$, $R(ms; \theta) = H -
B - s + D \exp(s - H + B)$, where $D$ is some constant. Over the remaining
subinterval of $(-B, A)$, $R(ms; \theta)$ satisfies the same differential-difference
equation as does $P_1(ms; \theta)$. An exact continuous solution to (21) may be obtained.
Approximate solutions and bounds for the solution to (21) may be found by the
technique indicated in Section 2. Since accuracy in the determination of the
ASN is not of basic importance in the design of sequential experiments, the
calculations just indicated will be left to the reader who is interested.


8. Examples. The following examples are based on the values $\alpha = \beta = 0.05$
and upon two choices of $r$, $r = 1.1$ and $r = 1.5$. They will serve to show the order
of accuracy to be expected and the computation needed in the use of the decision
boundary formulas (17). Six-digit accuracy appears to be useful in the
computations.

EXAMPLE 1. $r = 1.5$ and $n = 2$. For this case one needs $Q_2(x; \theta_2)$, $0 \le x \le
H_2 = 0.810930$. From (11), it is easy to obtain

\[ Q_2(x; \theta_2) = \frac{e^{rx} - g_2(x)}{e^{x} - g_2(x)}, \]

\[ g_2(x) = 1 + \frac{x e^{-H_2}(1 - H_2 e^{-H_2}) + \tfrac12 x^2 e^{-2H_2}}{1 - 2H_2 e^{-H_2} + \tfrac12 H_2^2 e^{-2H_2}} = 1 + 0.826045x + 0.287005x^2. \]

This yields the results: $Q_2(H_2; \theta_2) = 3.874306$ and $\max Q_2(x; \theta_2) = 3.877046$,
$\min Q_2(x; \theta_2) = 3.855812$. Formulas (17) then give $a = 2.53898$ and $b = 2.80647$.
By the inequalities (18), the true values of $P_1(0; \theta_2)$ and $P_2(0; \theta_1)$ satisfy
$0.04996 \le P_1(0; \theta_2) \le 0.05024$ and $0.0499874 < P_2(0; \theta_1) \le 0.0500017$.

As a comparison, the choice $n = 1$ gives $a = 2.53898$ and $b = 2.78504$ and
the bounds on $P_1(0; \theta_2)$ are 0.05000 and 0.05166. For the choice $n = 3$, the
bounds on $P_1(0; \theta_2)$ would be 0.049975 and 0.0500099.

It is interesting to note that the choice of decision boundaries from Wald [6]
would be $a = b = 2.94444$. For these values, the bounds (18) give $0.04428 \le
P_1(0; \theta_2) \le 0.04452$ and $0.03269 < P_2(0; \theta_1) < 0.03353$.
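
The error probabilities quoted in Example 1 can be checked roughly by simulation. The sketch below is an illustration only and is not part of the original paper: the function names `sprt_decision` and `error_rates` are hypothetical, the run count and seed are arbitrary, and the increments are generated as $z = mE - h$ with $E$ a unit exponential variable, $m = r - 1$ under $\theta_2$ and $m = 1 - 1/r$ under $\theta_1$, as in Sections 1 and 5.

```python
import math
import random

def sprt_decision(r, a, b, true_is_theta2, rng):
    """Run one test from x_0 = 0; return 2 if the sum reaches a, 1 if it reaches -b."""
    h = math.log(r)
    m = (r - 1.0) if true_is_theta2 else (1.0 - 1.0 / r)
    x = 0.0
    while -b < x < a:
        # z = m*E - h, where E is a unit exponential observation u/theta
        x += m * rng.expovariate(1.0) - h
    return 2 if x >= a else 1

def error_rates(r, a, b, runs, seed=0):
    """Monte Carlo estimates of alpha = P_2(0; theta_1) and beta = P_1(0; theta_2)."""
    rng = random.Random(seed)
    alpha_hat = sum(sprt_decision(r, a, b, False, rng) == 2 for _ in range(runs)) / runs
    beta_hat = sum(sprt_decision(r, a, b, True, rng) == 1 for _ in range(runs)) / runs
    return alpha_hat, beta_hat

# Boundaries from (17) with n = 2, and Wald's boundaries, for r = 1.5
alpha_hat, beta_hat = error_rates(1.5, 2.53898, 2.80647, runs=10000)
wald_alpha, wald_beta = error_rates(1.5, 2.94444, 2.94444, runs=10000)
print(alpha_hat, beta_hat, wald_alpha, wald_beta)
```

With ten thousand runs the standard error of each estimate is about 0.002, so the simulation can separate the boundaries of (17) from Wald's on the first kind of error but cannot resolve the six-digit bounds given by (18).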

EXAMPLE 2. $r = 1.1$ and $n = 2$. For this case $H_2 = 0.953102$ and $Q_2(H_2; \theta_2) =
1.374880$, $\max Q_2(x; \theta_2) = 1.375097$ and $\min Q_2(x; \theta_2) = 1.373430$. These yield
the decision boundaries $a = 2.84913$ and $b = 2.91201$ and the bounds

\[ 0.049992 < P_1(0; \theta_2) < 0.050052, \qquad 0.049997 < P_2(0; \theta_1) < 0.0500004. \]

In this example, the choice $n = 1$ might be satisfactory. It yields $a = 2.84913$,
$b = 2.90703$ with the bounds $0.05000 < P_1(0; \theta_2) \le 0.05043$.


ON THE NORMAL APPROXIMATION TO THE
HYPERGEOMETRIC DISTRIBUTION
By W. L. Nicholson
University of Illinois

1. Summary. In this paper a new normal approximation to a sum of hyper- 
geometric terms is derived, which is a direct generalization of Feller’s normal 
approximation to the binomial distribution [2]. For intervals that are asymmetric 
with respect to the mean, or when the distribution is skewed, the new approxi- 
mation is a marked improvement over the classical procedure. 

The hypergeometric distribution is discussed in Section 2, along with the 
classical norming and the resulting approximation. Feller’s remarkable normal 
approximation for the related binomial distribution is given in Section 3 with 
an indication of how it can be extended to cover the hypergeometric case. The 
result of such an extension is presented in Theorem 2 of Section 4. This theorem 
gives upper and lower bounds on the hypergeometric sum and hence provides 
a useful estimate of the relative error. Preliminary results to proving Theorem 2 
are exhibited in Section 5. The proof follows in Section 6. 


2. Introduction. Let $\pi$ be a finite population of $N$ elements, $D$ of which
possess a specified characteristic $S$. In a random sample of size $n$ ($n \le N$),
drawn without replacement from $\pi$, the probability $G_k$ that exactly $k$ of the $n$
elements possess $S$ is given by the hypergeometric function. Defining $H_{\lambda,\nu}$ as
the probability that $k$ satisfies the inequality $\lambda \le k \le \nu$, we have, symbolically,

\[ G_k = \binom{D}{k}\binom{N-D}{n-k} \Big/ \binom{N}{n}, \qquad H_{\lambda,\nu} = \sum_{k=\lambda}^{\nu} G_k. \tag{1} \]

The mean $\mu$ and the variance $\sigma_1^2$ of the distribution (1) are given by

\[ \mu = n\,\frac{D}{N} \quad \text{and} \quad \sigma_1^2 = \frac{n(N-n)}{N-1}\,\frac{D}{N}\,\frac{N-D}{N}. \tag{2} \]

If $N$, $D$, $n$, and $k$ increase without bound in such a manner that

\[ \frac{D}{N} \to \text{limit}, \quad \frac{n}{N} \to \text{limit}, \quad \text{and} \quad z_k = (k - \mu)\sigma_1^{-1} \to z, \text{ say}, \tag{3} \]

then (see [1], page 146),

\[ G_k \sim (2\pi)^{-1/2}\, \sigma_1^{-1}\, e^{-z^2/2}, \tag{4} \]

where the symbol “$\sim$” means that the ratio of the two sides tends to one as
the arguments increase. As a consequence of (4), approximations to $G_k$ and
$H_{\lambda,\nu}$ of (1) for large $N$, $D$, $n$, and $k$ are

\[ G_k \approx (2\pi)^{-1/2}\, \sigma_1^{-1}\, e^{-z_k^2/2} \quad \text{and} \quad H_{\lambda,\nu} \approx \Phi(z_{\nu+1/2}) - \Phi(z_{\lambda-1/2}), \tag{5} \]

respectively, where

\[ \Phi(z) = (2\pi)^{-1/2} \int_{-\infty}^{z} e^{-u^2/2}\, du. \tag{6} \]

Since (4) is an asymptotic result, we naturally are interested in the magnitude
of the error involved when finite values of $N$, $D$, $n$, and $k$ are used. The
maximum error in the approximation (5) to (1) is $O(\sigma_1^{-1})$. For most values of $z_k$
(excluding only those values of $z_k$ that are close to zero), the contribution of
the corresponding $G_k$ terms to the sum $H_{\lambda,\nu}$ is negligible in comparison to $\sigma_1^{-1}$.
Now, $z_k$ will be large if $|k - \mu|$ is large or if $DN^{-1}$ is close to zero or one. Hence,
for the cases of primary interest (an evaluation of the tail of the distribution
(1) and an evaluation of (1) when a small percentage of the elements of $\pi$ possess
or do not possess $S$, as the case may be) the approximation (5) leaves much
to be desired.

The above two instances will tend to invalidate the fit of any normal ap- 
proximation to (1), since they constitute cases of extreme deviation from nor- 
mality. For an approximation to (1) to be useful, it should be accompanied by 
a concrete bound on the error involved, preferably, a bound on the relative 
error that would not be affected seriously by the above extreme cases. Such 
a bound, as a function of $N$, $D$, $n$, $\lambda$, and $\nu$, would explicitly tell in any given
situation whether $N$, $D$, and $n$ were sufficiently large to give the desired accuracy.

Approximations of the type in (5) that are functions of linear limits possess
error terms which for the above extreme cases are at least $O(\sigma_1^{-1})$ over a uniformly
bounded interval, an interval which does not increase with $\sigma_1$. Outside of this
interval, the error is even larger. In an attempt to improve on the approximation 
(5), we consider the case where the limits are quadratic polynomials. The 
impetus for such an approach is due to the remarkable result of Feller [2] for 
the related problem of normal approximations to the binomial distribution. 
Since our development depends heavily on that of Feller, we include his result 
in detail. 


3. Feller’s result. For fixed $n$ and $0 < p < 1$, $q = 1 - p$, Feller’s designation
of the binomial distribution is

\[ T_k = \binom{n}{k} p^k q^{n-k}, \qquad P_{\lambda,\nu} = \sum_{k=\lambda}^{\nu} T_k. \tag{7} \]

Let

\[ x_k = \{k + \tfrac12 - (n+1)p\}\, \sigma_1^{-1} \quad \text{and} \quad \sigma_1^2 = (n+1)pq. \tag{8} \]

Replacing the orthodox norming, $(k - np)(npq)^{-1/2}$, by $x_k$, Feller derives an
exponential expansion for $T_k$ which lacks the troublesome square root factor
present in the classical expansion about $(k - np)(npq)^{-1/2}$. (For a discussion of
the classical procedure see [1], Chap. 7, Sec. 2.) Using the new expansion (see
Theorem 3 of this paper), he obtains upper and lower bounds for $P_{\lambda,\nu}$ as normal
integrals with quadratic limits in $x_\lambda$ and $x_{\nu+1}$, respectively. The unique
feature of the approximation is that the gap between the two bounds remains
$O(\sigma_1^{-1})$ throughout an interval which increases with $\sigma_1$. Moreover, a useful
upper bound on the relative error is provided.

Let $6a_1 = p - q$. Feller’s normal approximation to the binomial distribution
is contained in the following

THEOREM 1. Suppose that 


(9) au>3 

and 

(10) h=S(n+1)p, v+4S (n+ 1)p + 201/38. 

Then, 

a Pye SOPOT (41) — O(m)} 

uf 

(12) a. k — (n + 1)p 4% fk — (n+ pl’ 4 2a, _ es 
o1 a1 \ o1 ) o1 20; 


whereas the inequality in (11) ts reversed if 


k—(n+1)p, am fke- (n+ Up), 2m, Mi, 1 


(13 = ; aie — « 
} * 01 oi \ 01 J a1 6c, * 70, 
where 
3 
(14) M, = na = {vy +4— (nt Lp}*or. 
1 


[It should be stressed that this approximation holds for all combinations of n 
and p for which (n + 1)pq > 9. A must only be larger than the central value, 
and vy smaller than a monotone increasing function of o,, which for o; = 3 is 
more than two standard units above the central value. An analogous result 
holds for (A, v) intervals to the left of the central value. The gap between the 
bounds is O(0;'), as long as ti.» = O(0;), which covers most cases of interest. 

Returning to the hypergeometric problem, we note that if $N$ and $D$ are large
relative to $n$, sampling without replacement is closely approximated by sampling
with replacement. In this case, (1) differs little from the binomial distribution
(7), with $p = DN^{-1}$. This suggests that Feller’s result could be generalized to
the hypergeometric distribution if the ratio $G_k T_k^{-1}$ (with $p = DN^{-1}$) could be
written in a suitable manner as an exponential expansion of the type (31). In
Section 5 it is shown that by slightly altering the definition of $p$ to $p =
(D + 1)(N + 2)^{-1}$, the corresponding ratio $G_k T_k^{-1}$ does have such an expansion.
Multiplication by the Feller expansion for $T_k$ gives an expansion of $G_k$, which
admits almost identical treatment as that used by Feller to approximate $T_k$.


4. Normal approximation to hypergeometric. In order to simplify the
notation, we introduce several auxiliary functions. Let

\[ p = \frac{D+1}{N+2}, \qquad q = 1 - p; \qquad s = \frac{n+1}{N+2}, \qquad t = 1 - s. \tag{15} \]

Thus, for large values of $D$, $N$, and $n$, $p$ and $s$ are approximately the proportion
of elements in $\pi$ that possess $S$ and the sampling fraction, respectively. Set

\[ a = \tfrac16 (p - q)(t - s). \tag{16} \]

For each value of $k$, define

\[ x_k = \{k + \tfrac12 - (n+1)p\}\, \sigma^{-1}, \qquad \sigma^2 = (n+1)pqt. \tag{17} \]

The normal approximation for the hypergeometric distribution that is derived 
in this paper can now be stated in a form similar to that for the binomial distri- 
bution. The only changes are those due to the finite population. 
THEOREM 2. Suppose that 
(18) ¢>3 
and 
(19) tl), »+3s + 1)p - 
Then, 
N +1 
(20) A» S\> e"{P(n41) — &(m)} 
\N + 2, : 
where 


(21) R= 


5(1 — pg)(1 — st) 


360? 
and 


(22 ‘ k — (n+ 1)p 4 a ) k—(n4 1)p\? 4 2a a 
a k = a \ — — = % r— 


Oo og \ 0 j 


whereas the inequality in (20) is reversed if 


Nk — 


_k- (n + 1)p 4 a fk = (n + 1)p\" . 2a 


7 go \ o f o 


3 —4 
fyt+ts—(n-4 1)p}*o : 
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The remarks immediately following Theorem 1 pertain equally to Theorem 2 
if (n + 1)pgq, o1, and x,, are replaced by (n + 1)pgt, o, and x, , respectively. 

As a point estimate for H,,, , we suggest the right side of (20), with , defined 
by (22) plus (20°). Designating this estimate, the lower bound and the upper 
bound by A, L, and U, respectively, we obtain the following upper bound on 
the relative absolute error, 


(25) ; max [7 — L,U — ff). 


As an example of the increased accuracy afforded by the estimation procedure
of Theorem 2 over that of (5), we consider the case $N = 5000$, $D = 500$, $n = 500$,
$\lambda = 51$, and $\nu = 56$. Then, $\sigma^2 > 40$, which certainly satisfies (18). The correct
value is $H_{51,56} \approx 0.30847$. Theorem 2 gives the bounding interval as
(0.30426, 0.31050) with the point estimate 0.30770. By using (25), the
upper bound on the relative error is 0.92 per cent (calculation shows it to be
0.25 per cent). The classical procedure (5) estimates 0.31513 with a relative
error of 2.16 per cent, about nine times as large as that for the new point estimate.
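
The exact value and the classical estimate in this example can be recomputed directly. The following is a minimal sketch under stated assumptions: the function names are hypothetical, and the classical procedure (5) is taken to be the usual continuity-corrected form $\Phi(z_{\nu+1/2}) - \Phi(z_{\lambda-1/2})$ with $\mu$ and $\sigma_1$ from (2). It does not implement Theorem 2.

```python
import math

def hypergeom_sum(N, D, n, lo, hi):
    """Exact H_{lo,hi}: probability that lo <= k <= hi, by integer arithmetic."""
    total = math.comb(N, n)
    numer = sum(math.comb(D, k) * math.comb(N - D, n - k) for k in range(lo, hi + 1))
    return numer / total

def normal_cdf(z):
    """Standard normal distribution function Phi(z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def classical_estimate(N, D, n, lo, hi):
    """Normal approximation with linear limits and a half-unit continuity correction."""
    mu = n * D / N
    sigma = math.sqrt(n * (N - n) * D * (N - D) / (N * N * (N - 1)))
    return normal_cdf((hi + 0.5 - mu) / sigma) - normal_cdf((lo - 0.5 - mu) / sigma)

exact = hypergeom_sum(5000, 500, 500, 51, 56)
classical = classical_estimate(5000, 500, 500, 51, 56)
print(round(exact, 4), round(classical, 4), round((classical - exact) / exact, 4))
```

The binomial coefficients are handled as exact integers and divided only once, so no overflow arises even though the individual terms are very large.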

We cannot expect the discrepancy to always favor our new procedure to
such a marked degree. The symmetric cases when $p$ and $s$ are close to one-half
(i.e., when $a$ is close to zero) serve to illustrate this. Here, the limits (22) and
(23) of Theorem 2 are almost linear functions of $x$. We can expect the two
estimation schemes to give essentially the same result, and there is no guarantee
that the estimate of Theorem 2 will be better. As an example of the symmetric
case, we consider a case of perfect symmetry, $a = 0$. Let $N = 400$, $D = n = 200$,
$\lambda = 101$, and $\nu = 105$. Then $\sigma^2 > 25$, which satisfies (18). The true value is
$H_{101,105} \approx 0.32452$. Theorem 2 gives the bounding interval (0.31832, 0.32822) with 0.32476
as the point estimate. The bound on the relative error is 2.02 per cent, while
the actual relative error is 0.07 per cent. The orthodox estimate (5) is 0.32426
with a relative error of 0.08 per cent. While the two estimates do not differ
significantly, we still have the added attraction of the bounding interval provided
by the new procedure.

As a rule of thumb, we suggest the use of Theorem 2 when the distribution (1) 
is skewed (i.e., when a is not close to zero). If only a point estimate is wanted for 
H,,,, the symmetric case can probably be handled just as effectively with the 
classical procedure (5). 


5. Hypergeometric expansion. The following two lemmas and Theorem 3 are 
due to Feller [2]. We state them here for the sake of completeness (for proofs, 
see [2]). In the process of approximating H)., , sums must be replaced by integrals 
of the normal type. Lemma 1 expresses the normal integral in a form that will 
be useful in this connection. 

Lemma 1. For 0 < h < 1 and |zh| < 14, 


z+h/2 
(26) / eo? du =h exp {—2°/2 + (x* — 1)h’/24 + oh}, 


—h/2 
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with 
(27 —z‘* /880 < w < 1/264. 


We have slightly relaxed Feller’s condition of |zh| < 1. As originally stated in 
[2], the lemma is not sufficient for our purpose. More care in handling Feller’s 
inequalities shows that (16) of [2] is valid for 0 S a S 0.7. The remainder of 
the proof is identical to that of [2]. A modified form of Stirling’s formula is 
provided by Lemma 2. This will be useful in expanding G, as an exponential 
series. 

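Lemma 2. For $n \ge 4$,

(28)  $n! = (2\pi)^{1/2}(n+\tfrac12)^{n+1/2}\exp\left\{-(n+\tfrac12) - \frac{1}{24(n+\tfrac12)} + \frac{7}{2880}\,\frac{1+\theta_1}{(n+\tfrac12)^3}\right\},$

(29)  $n! = (2\pi)^{1/2}\,n^{n+1/2}\exp\left\{-n + \frac{1}{12n} - \frac{1+\theta_2}{360n^3}\right\},$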


where 
(30) ld;| < 3, $70 as n>, 
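
The size of the correction terms in Lemma 2 can be examined numerically. This is a rough sketch, assuming the two forms of Stirling's formula carry remainders $(7/2880)(1+\theta_1)/(n+\tfrac12)^3$ and $-(1+\theta_2)/(360n^3)$ respectively; the function names are hypothetical, and `math.lgamma` supplies $\log n!$.

```python
from math import lgamma, log, pi

def theta_half_shift(n):
    """theta_1 in the assumed form of (28), expanded about n + 1/2."""
    m = n + 0.5
    main = 0.5 * log(2 * pi) + m * log(m) - m - 1 / (24 * m)
    return (lgamma(n + 1) - main) * 2880 * m ** 3 / 7 - 1

def theta_classical(n):
    """theta_2 in the assumed form of (29), the classical Stirling series."""
    main = 0.5 * log(2 * pi) + (n + 0.5) * log(n) - n + 1 / (12 * n)
    return -(lgamma(n + 1) - main) * 360 * n ** 3 - 1

for n in (4, 10, 50):
    print(n, theta_half_shift(n), theta_classical(n))
```

Both quantities are small already at $n = 4$ and shrink as $n$ grows, which is the behavior the lemma relies on.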
Feller's exponential expansion of the binomial distribution (7) is given by
the following

THEOREM 3. If $k \ge 4$, $n - k \ge 4$, and $|x_1| < \sigma_1$,


’ —1/2 —1 
T; = (27) o1 


(31) 


where $x_1$ and $\sigma_1$ are defined by (8), and


Ai) RM a Se 
2880 |\(k +4)? ° {[(m+1)— (k+F/ * 360m + 1" 


(32) A = 


Here, as in the sequel, the subscript k on 2, will be omitted when there is no 
chance of confusion. 

In order to obtain an exponential expansion for the hypergeometric distribution 
(1) of the type (31), we consider the following norming. Let 


(33)  $x_2 = \{k + \tfrac12 - (n+1)p\}/\sigma_2$  and  $\sigma_2^2 = (N - n + 1)pq,$

where p and q are defined by (15). To utilize Feller's binomial expansion, we must
express the non-binomial portion of $G_k$ as a suitable exponential series. Write

(34)  $G_k = T_k C_k,$

with

(35)  $C_k = \frac{(D+1)!\,(N-D+1)!\,(N-n+1)!}{(D-k)!\,(N-D-n+k)!\,(N+2)!} \cdot \frac{(N+1)(N+2)}{(D+1)(N-D+1)(N-n+1)} \cdot \frac{1}{p^k q^{n-k}}.$
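
The factorization of the hypergeometric term into a binomial term and a correction factor can be verified in exact arithmetic. The sketch below assumes $G_k$ is the hypergeometric probability, $T_k$ the binomial probability with parameter $p$, and $C_k$ the product of the factorial ratio, the factor $(N+1)(N+2)/[(D+1)(N-D+1)(N-n+1)]$, and $1/(p^k q^{n-k})$. The identity holds for any $0 < p < 1$, so the value of $p$ used here is only illustrative and is not claimed to be the definition (15).

```python
from fractions import Fraction
from math import comb, factorial

def hypergeom_term(N, D, n, k):
    """G_k: probability of k marked items in a sample of n."""
    return Fraction(comb(D, k) * comb(N - D, n - k), comb(N, n))

def binomial_term(n, k, p):
    """T_k: binomial probability with success probability p (a Fraction)."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

def c_factor(N, D, n, k, p):
    """C_k as assumed above: factorial ratio times correction factors."""
    q = 1 - p
    factorials = Fraction(
        factorial(D + 1) * factorial(N - D + 1) * factorial(N - n + 1),
        factorial(D - k) * factorial(N - D - n + k) * factorial(N + 2),
    )
    correction = Fraction((N + 1) * (N + 2),
                          (D + 1) * (N - D + 1) * (N - n + 1))
    return factorials * correction / (p ** k * q ** (n - k))

N, D, n, k = 40, 15, 12, 5
p = Fraction(16, 42)  # illustrative value only
lhs = hypergeom_term(N, D, n, k)
rhs = binomial_term(n, k, p) * c_factor(N, D, n, k, p)
print(lhs == rhs)
```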


In the expansion of $C_k$, we shall use (28) for the two factorials involving $k$ and
(29) for the four not involving $k$. Then, if $D - k \ge 4$ and $N - D - n + k \ge 4$,

(36)
$$\begin{aligned}
\log C_k ={}& [(D+1)+\tfrac12]\log(D+1) \\
&+ [(N-D+1)+\tfrac12]\log(N-D+1) \\
&+ [(N-n+1)+\tfrac12]\log(N-n+1) \\
&- [(D+1)-(k+\tfrac12)]\log[(D+1)-(k+\tfrac12)] \\
&- [(N-D+1)-(n+1)+(k+\tfrac12)]\log[(N-D+1)-(n+1)+(k+\tfrac12)] \\
&- [(N+2)+\tfrac12]\log(N+2) \\
&+ \frac{1}{12(D+1)} + \frac{1}{12(N-D+1)} + \frac{1}{12(N-n+1)} - \frac{1}{12(N+2)} \\
&+ \frac{1}{24[(D+1)-(k+\tfrac12)]} + \frac{1}{24[(N-D+1)-(n+1)+(k+\tfrac12)]} \\
&+ \log(N+1) + \log(N+2) - \log(D+1) \\
&- \log(N-D+1) - \log(N-n+1) \\
&- k\log p - (n-k)\log q - \rho_2,
\end{aligned}$$


where 


Mees 





7 f l+¢ 1+ ¢1 } 
me (De c+ prt iwopt)—@+) + eFDE 


(37) a" 
1+ @& + 1 + ¢ ‘8 Se a ee 


+ 30 \D +p W- D+? Wont! WD’ 
We now introduce the substitutions (15) and (33). Algebraic manipulation
reduces (36) to 


log C, = 


en og Pq 

l2pq(N + 2) 120? 
+ log N + om 
(N + 2)!2(N — n + 1)! ee 
To expand (38) as an infinite series, we must impose the condition that $|x_2| < \sigma_2$.
The combination of the resulting series, in a manner analogous to that of Feller,
gives

THEOREM 4. If $D - k \ge 4$, $N - D - n + k \ge 4$, and $|x_2| < \sigma_2$,

N+1 Phe 


( 00 v—l 
ie ee ke a a 
C. (N + 2)2(N — n + 1)! ad | 2X v(v — 1) “ 


9 
2 


1 o0 of = Xe v—2 
+ gig - (-2"1 (2) 
“#105 3 o 


l 2 l—p 
it Pq Pq 


2402 12pq(N + 2) - ey . 


where $p$ and $q$ are defined by (15), $x_2$ and $\sigma_2$ by (33), and $\rho_2$ by (37).
Using (8), (15), (17), and (33), we can derive 


1 N+ 1 ] N 
) = é 1 7. eee ee - = 
(40 a2 3 2 ed (N + 2)'2(N — n+ 1)" o, 


Define 


(41)  $\rho = \rho_1 + \rho_2.$


Combine the expansions for $T_k$ and $C_k$, given in Theorems 3 and 4. Make the
substitutions indicated by (40) and (41), to obtain the following

THEOREM 5. If $k \ge 4$, $n - k \ge 4$, $D - k \ge 4$, $N - D - n + k \ge 4$, and
$|x| < \sigma$,


GC N+1 (Qe) 7/21 > [p" * — (-@) ‘Ve 
(== 21) a exp <— 
Tk N +2 ’ I t 3 viv — 1) 


2) 


> fp = (-9¢) "ir" - (-9)""4 (2) 


=<} 


l 
240? “3 


oO 


_1 + 2pq Pa l — pq 


Mot  ' 12pr(iN+2) PI’ 


where $p$, $q$, $s$, and $t$ are defined by (15), $x$ and $\sigma$ by (17), and $\rho$ by (32), (37), and
(41).

Except for the terms independent of $x$ and the extra factor in each of the
series, (42) has the same form as Feller's expansion (31). In the next section we
prove Theorem 2 by an argument almost identical to Feller's. Only minor changes
are necessary to cover the differences in the two expansions. Feller's notation is
used and reference to his proof is indicated by "F".

6. Proof of Theorem 2. Since we are interested in values of $k$ which satisfy
$\lambda \le k \le \nu$, hypothesis (19) implies (39 - F). Hypothesis (18) implies $k \ge 9$,
N—D-—-n+k29,n—k2 9(1 / pt — 3) —3,andD — k 2 9(1 / qs — 3) —- 
4. In most cases this is sufficient to satisfy the hypotheses of Theorem 5. To 
cover the possibility of either $pt$ or $qs$ being close to one, we have included the
extra hypotheses $n - k \ge 4$ and $D - k \ge 4$. In any case, the hypotheses of
Theorem 5 are satisfied and the expansion (42) of $G_k$ is valid. By using (39 - F),
the remainder, $\rho$, in (42) can be shown to satisfy


1 
(43) O<p< [208 

The remaining portion of the proof is devoted to showing that the expansion
(42), where $\rho$ satisfies (43), can be written as a product of two factors: the first
independent of $k$, and the second similar in form to the right side of (26) with
argument $x$ replaced by $m$, defined by either (22) or (23). If $m$ is given by (22),
we show that the second factor is less than the integral of (26) with $x$ replaced
by $m$, and if by (23), that it is greater than the corresponding integral of (26).
The proof is completed by summing over all admissible $k$ values.

For each k, define & by (38 - F), where a is defined by (16). In the sequel 
we shall use the fact that |ja| < %. The subscript k on & and 2, will be suppressed 
when convenient. By (39 - F), we have 


(44) 2S t S427 fora> QO, and sx SEZZ fora < 0. 
Set 





20 ‘tie - eh e-l 6)?" v 
(45) s+ 8 Oe ee se 
i v(v — 1) 1 
The first series of (42) is Ap. We write
(46) 


where 


3 i 3 (2 s° 
(47) Sam E r g)tt +8) 


i 12 


The function within the brackets is defined in the (p, s) unit square with ab- 
solute maxima of +; at (1, 1) and (0, 0) and the unique absolute minimum of 
rez at (3, 3). 

We shall need bounds on A. First, consider the case a > 0. In this case, all 
terms of A; are positive; so, by (47) and (44), 
1 x if 


= ifa> 0. 


(48) ; 
- 192 o? - 300 o? 


If a < 0, As is an alternating series with the first term negative. Each negative 
term is smaller in absolute value than the preceding positive one. So, by (39-F) 
and (47), 


4 


7 12 2 30 a 


(vy A a° (P 4 3°) a ( 4 ZF a’) (t ws 3 x 
(49) Az E Sane Sh ee : 
The function within the brackets is defined in the (p, s) unit square and has 
its unique absolute min mum of yz at (4, 4). Using (44), 


4 4 
(50) > = 
=; 


~ 
> 


> 199 + = 19: ifa <0. 


The series A; can be majorized by a geometric series to give a uniform upper 
4, » 2 mr ° on e 
bound of 1.01z° / 160°. Thus, from (47) we obtain 


(51) 


Let 


(52) B; = 5, 2, — (-9" it - (-9"" 


] 
240? “GF 


The second series of (42) is B; . We write 


a 


toi 


(53) B; = &E+ B, 


where 


(54) i. 3|£ + @)(@ +8) _ a 
> 


12 2 
Bounds on B are obtained in a similar manner to those obtained on A. An
argument on (54) identical to that preceding (48), with A; replaced by B;, 
gives 


(55) B>1(1)Zs0 if a > 0. 


> \192/ ot 


Likewise, if A < 0, the argument preceding (49), with A, replaced by B;, 
applies to (54). Therefore, from (54), 

3 3 3 3 2 4 ae 4 A = 4 2 
(56) B= 5| ete se 915. 


2 12 2 18 


g 


The function within the brackets has the same absolute minimum as that of 
(49); so, (55) is also valid if a < 0. While a uniform upper bound on A is suffi- 
cient for our purpose, we must consider the two cases separately on B. First, 
let a < 0, then the series B; can be majorized by 2° / 1920*. Using (44) and the 
discussion following (47), we have from (54) 


1/5 1 |x’ c- .. 
[3(4) + la<t5 He <0. 


For a > 0, Bs is a positive term series which can be majorized by x / 12¢ 
Again, by (44) and the discussion following (47), we have from (54) 


r1/5 tae 
[5(3)+al5<eh ifa>0. 


Define Af, by (50-F). Then (51-F) and (52-F) follow. Substitution of (46), 
(53), and (52-F) into the expansion (42) gives 


(57) B 


IIA 


4 


(58) B 


IA 


Y N+1,, \- l a 
oO Reee” o exp ~3F + 7a! 
9 
(59) — d tog (1 + :) wih + oe he 
2 o 240° 
1 — pq \ 


12pq(N + 2) a s ; 

To eliminate the logarithm term, define $C$ by (54-F). Expressing $C$ as an infinite
series, we can bound $C$ to obtain (55-F). Define $y$ and $\Delta y$ by (56-F), where
$u$ is a parameter to be determined. We note that $y$ as a translation of $\xi$ also
satisfies (51-F). If we define $u$ by (57-F) and define $m$ by (22), then (58-F) is
valid. Likewise, if we define $u$ by (59-F) and define $m$ by (23), the identities
(58-F) are still valid. Because of (58-F), our Theorem 2 will be proved if we
show that with $u$ defined by (57-F),

1 -N+1 2f \\ 


~ N+2 


bole 


bol = 
and that the inequality in (60) is reversed if $u$ is defined by (59-F); $R$ is defined
by (21).
Using (54-F) and (56-F), we can transform (59) to


lr " N + l f \~—l/2ry 
(61) G, © e* (2x) ’F, , 
where 

: ts, Cr es | 
(62) F, Ay exp, — sy ot (y —1)+ 2£> 
and £ is given by (62-F). 

Let $u$ be defined by (57-F). Noting the form of (62) and Lemma 1, with $h$
and $x$ replaced by $\Delta y$ and $y$, respectively, the inequality (60) will be proved if
we show that (63-F) is satisfied.

Substitution of the bounds (18), (43), (48), (50), (57), (58), and (55-F) into 
(62-F) gives, using (63-F), 


oF, < - oe i oe , ifa>od 


240 


2 2 1 
2 _ a P : 
ok; < - 2 192 § ifa < 0. 


We are interested in values of x which satisfy (39-F). For such values, § = 
107/216c. Elementary calculations show that the quartics in (63) and (64) are 
negative if § = 107/216c; therefore, (63-F) is true. This implies (60) is also 
true. Summing over all $k$ values in the interval $\lambda \le k \le \nu$, (58-F) and (60)
give (20). Thus, the upper bound for $H_{\lambda,\nu}$ is valid.

A similar argument suffices to prove that the lower bound is valid. Let $u$ be defined
by (59-F); then, as before, from (62) and Lemma 1, (60) with the reverse
inequality will be proved if we show that


\4 
_ (4y) . 


(65) E,=E 


= 0. 


264 
First, we need several auxiliary bounds. By (24), (44), and (54) we have 
3) 
3 f . 
200 


2h 
#" ifa> 0. 


A < ifa <0, 
(66) 


A-< 
l5e 


Also, (69-F) follows if a < 0, and 


= y 4at 11\’ 1 2a u\ 
if a > 0. Substitution of the bounds (18), (43), (55) (which is independent of
a), (55-F), (66), (69-F), and (67) into (62-F) gives (70-F) if a < 0, and 


B, > uw as Z) a 
20° 60? / 1804 300% 


u 2M 1 u _ 
+(-2-F - aot e)t- peat 


o l5e 1803 60 12¢ 


(68) 


if a > 0. Bounding the constant term in (70-F) and (68) (constant with respect 
to &) and evaluating the coefficient of — by means of (59-F), we obtain 


‘fy 6 1 293 (£) 1 ty 
9 BE. -__—-— a — - — = _ 
(69) >= — F070? 50st * 268\c) ~ 24 (é 
if a < 0, and 
ea ] 1 152 /¢ 1 /tY 
(70) Bi -o:- t- = ) — — f) 
. >= ~ 5000 3004 1134 (= 12 (: 


if a > 0. The right sides of (69) and (70) are parabolas in — opening downward. 
To show non-negativity, we need only to check at the endpoints of the £ in- 
tervals which correspond to (39-F). These are 


107 20 ., 

a160 ~§ <3 ifa <0, 
(71) 

Segue: tore 

“ ad 


Making the above substitutions, the right sides of (69) and (70) are seen to be 
positive. Hence, (65) is true; therefore, the lower bound, (60) with inequality 
reversed, is also valid. As before, summing over all admissible k values gives 
(20) with the inequality reversed. Q.E.D. 
CHARTS OF THE POWER OF THE F-TEST
By MARTIN FOX
University of California, Berkeley

1. Introduction. This paper presents charts of the power of the F-test designed
to simplify entry and interpolation. The curves on which the quantity $\phi$ is
constant are given for fixed level of significance $\alpha$ and power $\beta$. The coordinates
are $f_1$ and $f_2$, the number of degrees of freedom in the numerator and denominator,
respectively, of the F-statistic. Charts are presented for $\beta$ = 0.5, 0.7,
0.8, 0.9 both for $\alpha$ = 0.01 and $\alpha$ = 0.05 (Figs. 1 to 8). In addition, nomograms
are presented for $\alpha$ = 0.01, 0.05 (Figs. 9 and 10) which make interpolation in $\beta$
possible. The latter charts give linear approximations to the curves on which $\phi$
is constant.


The quantity $\phi$ is defined as $\sqrt{S_1^{*}/[(f_1 + 1)\sigma^2]}$, where $S_1^{*}$ is the value of $S_1$
when the observable random variables are replaced by their expectations under
the alternative hypothesis considered, and $S_1$ is the sum of squares in the
numerator of the F-statistic.

With these charts the following question may be answered: What experimental
setup is required (what combination of $f_1$ and $f_2$) in order to obtain a specified
power $\beta$ against a given alternative?

Tables of the power of the F-test have been given in two forms. Lehmer [2]
tabled $\phi$ for fixed $\alpha$, $\beta$, $f_1$, and $f_2$. On the other hand, Tang [4] tabled $P_{II} =
1 - \beta$ for fixed $\alpha$, $\phi$, $f_1$, and $f_2$. Essentially the same information as in Tang's
tables was given, in graphical form, by Pearson and Hartley [3]. However,
neither of these forms is always convenient for the design of experiments where
a relation between $f_1$ and $f_2$ is desired for fixed $\alpha$, $\beta$ for a specified alternative
hypothesis.


2. Construction of the charts. The present charts were constructed by interpolation,
both numerical and graphical, in the existing tables. For $\beta$ = 0.5 and
0.9, Tang's tables were used; while for $\beta$ = 0.7 and 0.8, Lehmer's tables were
found convenient.

Lehmer remarks that in her tables harmonic interpolation in both $f_1$ and $f_2$
is very efficient. For this reason reciprocal scales were used for $f_1$ and $f_2$. On
this scale the curves of constant $\phi$ obtained from Lehmer's tables are nearly
straight lines (see Figs. 2, 3, 6, and 7). This is especially striking for large $f_1$
and $f_2$.

Tang's tables give no entries for $f_1 > 8$. However, formula (13) of Lehmer
may be used to compute $\phi$ for $f_1 = \infty$, while the case $f_2 = \infty$ is covered by the
table of Fix [1]. As noted above, replacing the curves by straight lines for large
values of $f_1$ results in very small errors. Therefore, to continue the curves beyond
$f_1 = 8$ on the charts obtained from Tang's tables, straight lines were used
between $f_1 = 8$ and $f_1 = \infty$.

Since for practical purposes the curves for constant $\phi$ may be replaced by
straight lines, it is sufficient to provide two points for each value of $\phi$ with
fixed $\alpha$ and $\beta$. Thus, on the nomograms the curves of constant $\phi$ may be reconstructed
by connecting corresponding points in the two grids. For example,
if the curve for $\phi = 1.6$ with $\alpha = 0.05$ and $\beta = 0.8$ is desired, it may be obtained
by connecting the two intersections of the curves for $\phi = 1.6$ with the curves
for $\beta = 0.8$ on Fig. 10. When this connecting line is extended, it intersects the
vertical line $f_1 = 4$ at the horizontal line $f_2 = 75$. Also this extended line intersects
$f_1 = 5$ at $f_2 = 30$, $f_1 = 60$ at $f_2 = 10$, etc., where the $f_2$ values are always rounded
to the next larger integer. Thus, the power $\beta = 0.8$ can be achieved when $\phi =
1.6$ with any of these pairs of values of $f_1$ and $f_2$.
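
The straight-line extension on reciprocal scales can be mimicked in a few lines. This is only a sketch: it assumes the line is exactly straight in the coordinates $(1/f_1, 1/f_2)$ and treats the two rounded chart readings quoted above, $(f_1, f_2) = (4, 75)$ and $(5, 30)$, as lying exactly on it. The function name is hypothetical.

```python
from math import ceil

def f2_on_line(f1, point_a, point_b):
    """Value of f2 at a given f1 on the line through two chart points,
    where the line is straight on reciprocal scales (1/f1, 1/f2)."""
    (a1, a2), (b1, b2) = point_a, point_b
    slope = (1 / b2 - 1 / a2) / (1 / b1 - 1 / a1)
    inv_f2 = 1 / a2 + slope * (1 / f1 - 1 / a1)
    return 1 / inv_f2

f2 = f2_on_line(60, (4, 75), (5, 30))
print(f2, ceil(f2))  # rounded to the next larger integer, as in the text
```

With these two readings the extrapolated value at $f_1 = 60$ rounds up to the $f_2 = 10$ quoted in the text.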

The nomograms have the disadvantage of restricting the range of values
of $\phi$.

On the nomograms, the curves for $\beta = 0.6$ were added by linear interpolation
along the curves for $\phi$. Intersections for $\beta$ = 0.55, 0.65, 0.75, 0.85 were obtained
in the same way.


3. Interpolation. For values of $\phi$ intermediate to those given in Figs. 1 to 8
linear interpolation along the normals to the curves may be used. On the nomograms
linear interpolation along the $\beta$ curves may be used for intermediate
values of $\phi$, and vice versa.


4. Example. As an illustration of the use of the charts, we consider the design
of an experiment to test for possible effects of geographic locality on electrodermal
resistance in 10-year-old children. We shall test children from $k = 6$
cities. Let the hypothesis to be tested at the 5 per cent significance level be
that the locality effects are zero. Suppose we want a reasonable chance $\beta$ of
detecting that the locality effects are not zero when they are really
$\delta_i$, $i = 1, \cdots, k$, where $\sum_{i=1}^{k} \delta_i = 0$. In particular, suppose that when $\sum \delta_i^2/\sigma^2 = 2$,
that is, when the sum of squares of locality effects in units of the standard
deviation $\sigma$ of a single measurement is 2, we want the probability that we
conclude that the locality effects are not zero to be at least $\beta = 0.8$. What
number $n$ of children must be tested in each city to achieve this power?

In this case, $f_1 = k - 1 = 5$ and $f_2 = k(n - 1) = 6(n - 1)$. Furthermore,

$\phi = \sqrt{S_1^{*}/[(f_1 + 1)\sigma^2]} = \sqrt{n \sum \delta_i^2/(k\sigma^2)}.$


A procedure for determining $n$ is the following:

(a) We assume a trial value of $n$. When one of Figs. 1 to 8 is to be used, we
may obtain this trial value by reading the value of $\phi$ for the curve meeting
$f_2 = \infty$ at our value of $f_1$ and then solving for $n$ in the relation $\phi = \sqrt{n \sum \delta_i^2/(k\sigma^2)}$,
using the next larger integer. (In this case we read $\phi = 1.46$. Solving for $n$ we
obtain $n = \phi^2 k\sigma^2/\sum \delta_i^2 = 6(1.46)^2/2 = 6.39$. Thus, we use $n = 7$ as our first
trial value.)

(b) We fix $\sum \delta_i^2/\sigma^2$ at the value for which it is desired that the power be $\beta$.
(In this case $\sum \delta_i^2/\sigma^2 = 2$.)

(c) We compute $\phi$ and $f_2$. (In this case $\phi = \sqrt{7(2)/6} = 1.527$ and $f_2 =
6(6) = 36$.)

(d) Turning to the chart appropriate to our $\alpha$ and $\beta$, we find the intersection
of the curve for the value of $\phi$ in (c) with the line for the value of $f_1$. (In this
case we use Fig. 7 and find the intersection of the curve for $\phi = 1.527$ with
the line $f_1 = 5$. This is at $f_2 = 60$.)

(e) We repeat steps (a) through (d) until we have two consecutive values of
$n$ such that for one the value of $f_2$ obtained in (c) is larger than that obtained
in (d) and for the other it is smaller. The larger of these two consecutive values
of $n$ is the required value.
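
The arithmetic in steps (a) to (e) can be sketched as follows. The chart readings cannot be computed here; the values 60 and 23 are the readings quoted in the text for trial values $n = 7$ and $n = 8$. The function names and the dictionary of readings are illustrative assumptions, with `effect` standing for $\sum \delta_i^2/\sigma^2$.

```python
from math import ceil, sqrt

def first_trial_n(phi_at_infinity, k, effect):
    """Step (a): solve phi = sqrt(n * effect / k) for n and round up."""
    return ceil(phi_at_infinity ** 2 * k / effect)

def trial_row(n, k, effect):
    """Step (c): phi and the computed denominator degrees of freedom f2."""
    phi = sqrt(n * effect / k)
    f2 = k * (n - 1)
    return phi, f2

k, effect = 6, 2
n0 = first_trial_n(1.46, k, effect)

# f2 read from the chart for each trial n (readings quoted in the text).
chart_f2 = {7: 60, 8: 23}

# Step (e): the required n is the smallest trial value whose computed f2
# is at least the f2 read from the chart.
required_n = min(n for n in chart_f2 if trial_row(n, k, effect)[1] >= chart_f2[n])

for n in sorted(chart_f2):
    phi, f2 = trial_row(n, k, effect)
    print(n, round(phi, 3), f2, chart_f2[n])
print("first trial:", n0, "required n:", required_n)
```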

The following table summarizes the results of this procedure for our example:

  Trial n    $\phi = \sqrt{n(2)/6}$    $f_2 = 6(n-1)$    $f_2$ from Chart
     7              1.527                   36                  60
     8              1.633                   42                  23

Thus, we require n = 8.

Suppose we require a more stringent design. For example, suppose that with
$\alpha$, $k$, and $\sum \delta_i^2/\sigma^2$ as before we wish $\beta = 0.85$. Since interpolation in $\beta$ is necessary,
Fig. 10 must be used. Otherwise the procedure is the same. In this case
the above table becomes

  $\phi = \sqrt{n(2)/6}$    $f_2$ from Chart

Here we obtained the line for trial value n = 10 by connecting with a ruler
the interpolated point for $\beta = 0.85$, $\phi = 1.825$ on the left grid of Fig. 10 with
the interpolated point on the right grid. Reading horizontally from the intersection
of this line extended to the line $f_1 = 5$, we found $f_2 = 17$.

Since for trial value n = 9 the computed $f_2$ is larger than $f_2$ from the chart,
while for trial value n = 8 the computed $f_2$ is smaller than $f_2$ from the chart,
we require n = 9.


5. Acknowledgments. The author wishes to express his gratitude to Prof. J. 
L. Hodges, Jr., for suggesting the form of the charts, to Prof. Elizabeth L. Scott 
for her help and encouragement, and to Mrs. Ruth Dubroff for her excellent 
drawing of the charts. 
ESTIMATING THE PARAMETERS OF A TRUNCATED
GAMMA DISTRIBUTION


By DOUGLAS G. CHAPMAN
University of Washington


1. Summary. A table is given to simplify the estimation of the parameters 
of an incomplete gamma or Type III distribution. A new procedure is also 
suggested for estimating the parameters of a truncated gamma distribution. 
This method is also applicable to a number of other truncated distributions, 
whether the truncation is in the tails or the center of the distribution. 


2. Introduction. Several examples have been given recently, employing the 
incomplete gamma or Type III distribution in fitting rainfall data; for instance, 
see [1, 2]. In an animal population study [3], it was found that the migration 
pattern could be fitted by this type of distribution. Frequently in such migration 
studies the data are truncated, that is, observations begin after migration has 
commenced or conclude before it has stopped. 

The parameters of the gamma distribution are often estimated by the method 
of moments in such cases (for example, see [4], pp. 121, 125), despite the fact 
that Fisher [5] showed the method to be inefficient. To facilitate solution of the 
maximum likelihood equations for estimation of the parameters in the un- 
truncated case, a simple table is given. 

The estimation of the parameters of a truncated gamma distribution by the 
method of moments has been studied by Cohen [6]. Since the integral of the 
probability density cannot be expressed in closed form, even the moment 
estimates are tedious to obtain; no attempt has been made to evaluate their 
variances or to study their efficiencies. After this paper was completed, a new 
study of the problem was published by Des Raj [7]. He gives the maximum 
likelihood equations for a number of cases of truncated and censored samples, 
mainly, however, under the assumption that the third standard moment is 
known. These equations can be solved only by iterative methods. In this paper 
a new method of estimation of these parameters is introduced which is easier to 
apply. The asymptotic variance-covariance matrix of the estimates is determined.


3. Estimation with origin known. The density function of the gamma distribution
may be written in the form

$f(x) = \frac{a^b}{\Gamma(b)}\, e^{-a(x-c)} (x-c)^{b-1}, \qquad x \ge c.$

The parameters are frequently transformed so that the distribution is expressed
as a function of the mean, variance, and skewness. Since the corresponding
sample quantities do not efficiently estimate the parameters, such a transformation
appears to be misleading.

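The maximum likelihood equations, based on a sample of n observations,
have been given by Fisher [5]:

(1)  $\frac{1}{n}\frac{\partial L}{\partial a} = \frac{b}{a} - \frac{1}{n}\sum_{i=1}^{n}(x_i - c) = 0,$

(2)  $\frac{1}{n}\frac{\partial L}{\partial b} = \ln a - \frac{\Gamma'(b)}{\Gamma(b)} + \frac{1}{n}\sum_{i=1}^{n}\ln(x_i - c) = 0,$

(3)  $\frac{1}{n}\frac{\partial L}{\partial c} = a - \frac{b-1}{n}\sum_{i=1}^{n}\frac{1}{x_i - c} = 0.$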


Since the parameter c determines the region of positive density, care must
be exercised in obtaining its maximum likelihood estimate. If $b > 1$, then it is
easy to verify that equation (3) gives the maximum likelihood estimate of c;
if, however, $b \le 1$, this is no longer true. In this case f(x) is monotone decreasing
for $x \ge c$, and $\min_i x_i$ is the maximum likelihood estimate of c.

We consider first the case where the origin is known, so that c may be set
equal to zero without loss of generality and equation (3) drops out. Letting

$\ln g = \frac{1}{n}\sum_{i=1}^{n}\ln x_i,$

(1) and (2) yield

$y(b) = \ln b - \frac{\Gamma'(b)}{\Gamma(b)} = \ln \bar{x} - \ln g,$

where $\bar{x}$ is the sample mean. Since $\Gamma'(b)/\Gamma(b)$, the digamma function, has been
tabulated by Gauss [8] and by Pairman [9], it is easy to construct a table of $y(b)$
and solve for $b$ by inverse interpolation. A small tabulation of $y(b)$ is given in
Table I; a more complete tabulation of $y(b)$ is available in mimeographed form
from the Laboratory of Statistical Research, University of Washington. There $y(b)$
and its first and second differences are tabulated for b = 0.01(0.01)2, 2(0.02)5, 5(0.1)20,
20(1)100. The table was checked by summing columns in the basic tables and
should be correct to one figure in the fifth decimal.
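
In place of inverse interpolation in a table, the equation $y(b) = \ln \bar{x} - \ln g$ can be solved numerically. The following is a minimal sketch, not the tabular procedure of the paper: it assumes a digamma function built from the recurrence $\psi(x) = \psi(x+1) - 1/x$ together with the standard asymptotic series, and solves for $b$ by bisection, using the fact that $y(b)$ decreases in $b$. All function names are hypothetical.

```python
from math import log, sqrt

def digamma(x):
    """Digamma function for x > 0: shift x upward, then use the asymptotic series."""
    result = 0.0
    while x < 6.0:
        result -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    series = inv2 * (1.0 / 12 - inv2 * (1.0 / 120 - inv2 / 252))
    return result + log(x) - 0.5 / x - series

def y(b):
    """y(b) = ln b - digamma(b), a decreasing function of b."""
    return log(b) - digamma(b)

def solve_shape(target, lo=1e-3, hi=1e3):
    """Solve y(b) = target by bisection on a logarithmic scale."""
    for _ in range(200):
        mid = sqrt(lo * hi)
        if y(mid) > target:
            lo = mid  # y is too large, so b must be larger
        else:
            hi = mid
    return sqrt(lo * hi)

def fit_gamma(xs):
    """Estimates (a, b) for the untruncated case with origin c = 0."""
    n = len(xs)
    mean = sum(xs) / n
    log_g = sum(log(x) for x in xs) / n
    b = solve_shape(log(mean) - log_g)
    return b / mean, b  # equation (1) with c = 0 gives a = b / mean

a_hat, b_hat = fit_gamma([1.0, 2.0, 3.0, 4.0])
print(a_hat, b_hat)
```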


4. Estimation in the truncated case with known origin. The density function
is now written

(4)  $f(x) = \frac{e^{-ax}x^{b-1}}{K(a,b)}, \quad 0 \le x \le T; \qquad f(x) = 0$ elsewhere,

where

$K(a,b) = \int_0^T e^{-ay}y^{b-1}\,dy.$
TABLE I
$b = y^{-1}(a)$, where $y(b) = \ln b - \Gamma'(b)/\Gamma(b)$ may be used to estimate
the parameter b in the density

$f(x) = [a^b/\Gamma(b)]\,e^{-ax}x^{b-1}$; i.e., $b = y^{-1}[\ln \bar{x} - \ln g]$


Third Decimal of a 


‘a 6 


5 
/ 





| 

.84 | 38.63 | 35.88 | 33.51 | 31.42 | 29.59 | 27. 
| 21.91 | 21.00 | 20. 40 | 18.68 | 18. 
.79 | 15.32 | 14.87 | 14. 05 | 13.68 | 13. 
07 .79 | 11.53 | 11 .03 | 10.80 

78 | 9.60] 9.42] 9.25| 9.09] 8.94 

.98 | 7.86 | 7.74] 7.63 | 
11} 7.01 | 6.92| 6.83 | 6.74] 6.66 

| 6. 11 | 6.04 98} 5.91 

60 | 5.8 5.48 37 | 5.32 | 




5.42 | 5 


Second and Third Decimals of a 


2 | 30 40 so |) (60 70 
s | = 45 a» | 2 Tt 2 
28} 3.10 
19 | 3.01 | 
07 | 2.00 
2.04| 1.97 


73 3 
3 
) 
2 
1.53 | 1.50 
1 
1 
l 


61 
24 
19 
.62 
.60 
.28 | 


.O1 | 
.86 
33 | 
.28 
.66 | 
.64 
30 | 


4.33 | 
4.16 | 
2.43 
.38 
a | 


.69 | 




.52 1.48 
23} 1.20 | 


.22 1.19 | 


Second Decimal of a





0.952 


0.832) 


0.740 
0.670 


1.10 | 
0.938 
0.822 
0.732 
0.662 


1.08 | 


1.03 | 1.01 


0.925 
0.812 
0.725 


0.656 


0.887 
0.783 
0.702) 


0.876 
0.774 
0.695 
0.632 


0.638 


The maximum likelihood functions now involve derivatives of K with respect 
to a and b, respectively; a double-entry table would be necessary to obtain the 
maximum likelihood estimates of a and b, and even this would involve double 
inverse interpolation. 

In lieu of this, another method of estimation is proposed. Let the $n$ observations
be grouped by classes $(\xi_i - h_i,\ \xi_i + h_i)$ $(i = 1, 2, \cdots, r)$, where $\xi_1 - h_1 = 0$,
$\xi_r + h_r = T$, $\xi_i + h_i = \xi_{i+1} - h_{i+1}$, $i = 1, 2, \cdots, r - 1$. Denote by $\nu_i$ the
number of observations falling in class $i$, i.e., between $\xi_i - h_i$ and $\xi_i + h_i$.
Define

(5)  $p_i = K^{-1}\int_{\xi_i - h_i}^{\xi_i + h_i} e^{-ax}x^{b-1}\,dx \approx K^{-1}e^{-a\xi_i}\xi_i^{b-1}(2h_i), \qquad q_i = \frac{\nu_i}{n}.$

Now

(6)  $\ln p_i - \ln p_{i+1} = a(\xi_{i+1} - \xi_i) + (b-1)\ln\frac{\xi_i}{\xi_{i+1}} + \ln\frac{h_i}{h_{i+1}}, \qquad i = 1, 2, \cdots, r-1,$

to the degree of approximation indicated by (5).

The form of equation (6) suggests estimating $a$ and $b$ by a least-squares
procedure, with $q_i$ replacing $p_i$. This can be justified as an approximate procedure
by the following results. To terms of order $1/n$,

(7)  $E(\ln q_i) = \ln p_i - \frac{1 - p_i}{2np_i},$

(8)  $E(\ln q_i - \ln p_i)^2 = \frac{1 - p_i}{np_i},$

(9)  $E\left[\left(\ln\frac{q_i}{p_i}\right)\left(\ln\frac{q_j}{p_j}\right)\right] = -\frac{1}{n}, \qquad i \ne j.$

These results can be obtained by expanding $\ln(q_i/p_i) = \ln(1 + (q_i - p_i)/p_i)$
in a Taylor series (assuming that $\Pr(q_i = 0)$ and $\Pr(q_i > 2p_i)$ may be neglected
for large $n$).

To show that the higher-order terms of the series expansion may be neglected,
the following results are needed:

$E(q - p)^{2s+1} = O\left(\frac{1}{n^{s+1}}\right),$

$E(q - p)^{2s} = O\left(\frac{1}{n^{s}}\right).$

These may be proven by induction, making use of the recurrence formula
for the central moments of the binomial (and hence also of the multinomial)
distribution. This recurrence formula is

$\mu_{s+1} = pq\left(ns\,\mu_{s-1} + \frac{d\mu_s}{dp}\right),$

where $\mu_s$ is the $s$th central moment.
Moreover, the limiting distribution of the $\ln q_i$ is easily obtained from the
following lemma:
y(n) > . . 
Lemma. Let {X$”} (¢ = 1, 2,---, r) be a@ sequence of random variables and 


ui, a(t = 1, 2,--- , r) be constants such that the joint distribution of 


y(n (xs? — + = . 5 
yp. G0 -w (¢ = 1,2, ---,7r) 


tends to the limiting distribution F(y., y2, ++: , Yr) as n — ©, and let f(x) be 
of class C™ in the neighborhood of (u:, ua, -** , Mr) with 


Then ZS” = V/nlf(X$”) — f(ud] / (oi-ts) (@ = 1,2, --- , r) have the same joint 
limiting distribution. 

This is a consequence of the general theorems on stochastic limit relationships 
proved by Mann and Wald [10] (see their Theorems 3 and 5; however, a trivial 
modification is required, since our $f(x)$ is a function of a single real variable,
whereas their corresponding g(x) is a function of a vector-valued random vari- 
able). 

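Finally, writing

(10)  $y_i = \ln q_i - \ln q_{i+1} \qquad (i = 1, 2, \cdots, r-1),$

it follows that the $y_i$ are asymptotically multinormal with means

$a(\xi_{i+1} - \xi_i) + (b-1)\ln\frac{\xi_i}{\xi_{i+1}} + \ln\frac{h_i}{h_{i+1}}$

and moment matrix

$\mathfrak{M} = \frac{1}{n}\begin{pmatrix}
\frac{1}{p_1}+\frac{1}{p_2} & -\frac{1}{p_2} & & 0 \\
-\frac{1}{p_2} & \frac{1}{p_2}+\frac{1}{p_3} & \ddots & \\
 & \ddots & \ddots & -\frac{1}{p_{r-1}} \\
0 & & -\frac{1}{p_{r-1}} & \frac{1}{p_{r-1}}+\frac{1}{p_r}
\end{pmatrix}.$

Least-squares estimators of $a$ and $b$ are found by minimizing the quadratic
form

$[y - E(y)]'\,\mathfrak{M}^{-1}\,[y - E(y)],$

where the vector $y' = (y_1, y_2, \cdots, y_{r-1})$.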
In view of the asymptotic distribution of the $y_i$, these estimates are asymptotically
efficient relative to the $y_i$. What information is lost in using the variables
$y_i$ rather than the original observations $x_i$? If the original observations
were ungrouped, a slight loss of information would be caused by grouping to
form the variables $\nu_i$. Since the $\ln q_i$ are monotone functions of the $\nu_i$, no
information is lost in this transformation. The $y_i$ are linear combinations of the
$\ln q_i$; some further information is lost here in exchange for elimination of the
factor $K^{-1}$ from the estimation process.

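Since the true values of the $p_i$ are not known, it is necessary to replace the
$p_i$ in $\mathfrak{M}$ by their estimates, the $q_i$. Introducing the notation

(11)  $w_i = y_i - \ln h_i + \ln h_{i+1},$
(12)  $u_i = \xi_{i+1} - \xi_i,$
(13)  $v_i = \ln \xi_i - \ln \xi_{i+1},$

the equations for $a$ and $b' = b - 1$ are

(14)  $a\sum_i\sum_j m_0^{ij}u_iu_j + b'\sum_i\sum_j m_0^{ij}u_iv_j = \sum_i\sum_j m_0^{ij}u_iw_j,$

(15)  $a\sum_i\sum_j m_0^{ij}u_iv_j + b'\sum_i\sum_j m_0^{ij}v_iv_j = \sum_i\sum_j m_0^{ij}v_iw_j,$

$m_0^{ij}$ denoting the elements of $\mathfrak{M}_0^{-1}$ ($\mathfrak{M}^{-1}$ with $p$'s replaced by $q$'s).
The solutions of these are

(16)  $\hat{a} = \frac{1}{\Delta}\left[(v'\mathfrak{M}_0^{-1}v)(u'\mathfrak{M}_0^{-1}w) - (u'\mathfrak{M}_0^{-1}v)(v'\mathfrak{M}_0^{-1}w)\right],$

(17)  $\hat{b}' = \frac{1}{\Delta}\left[(u'\mathfrak{M}_0^{-1}u)(v'\mathfrak{M}_0^{-1}w) - (u'\mathfrak{M}_0^{-1}v)(u'\mathfrak{M}_0^{-1}w)\right],$

where

$\Delta = (u'\mathfrak{M}_0^{-1}u)(v'\mathfrak{M}_0^{-1}v) - (u'\mathfrak{M}_0^{-1}v)^2,$

and the covariance matrix of $(\hat{a}, \hat{b}')$ is

(18)  $\frac{1}{\Delta}\begin{pmatrix} v'\mathfrak{M}_0^{-1}v & -u'\mathfrak{M}_0^{-1}v \\ -u'\mathfrak{M}_0^{-1}v & u'\mathfrak{M}_0^{-1}u \end{pmatrix}.$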
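
The least-squares computation can be sketched in a few lines. The code below is an illustration under stated assumptions rather than the author's routine: it takes the moment matrix to be tridiagonal with diagonal elements $(1/n)(1/q_i + 1/q_{i+1})$ and off-diagonal elements $-1/(nq_{i+1})$, forms $u$, $v$, $w$ as in (11) to (13), and solves the two normal equations by Cramer's rule. The helper names (`moment_matrix`, `solve`, `estimate`) and the test data are hypothetical.

```python
import math

def moment_matrix(q, n):
    """Assumed tridiagonal moment matrix of y_i = ln q_i - ln q_{i+1}."""
    m = len(q) - 1
    M = [[0.0] * m for _ in range(m)]
    for i in range(m):
        M[i][i] = (1.0 / q[i] + 1.0 / q[i + 1]) / n
        if i + 1 < m:
            M[i][i + 1] = M[i + 1][i] = -1.0 / (n * q[i + 1])
    return M

def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    m = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(m):
        piv = max(range(col, m), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        for r in range(col + 1, m):
            f = aug[r][col] / aug[col][col]
            for c in range(col, m + 1):
                aug[r][c] -= f * aug[col][c]
    z = [0.0] * m
    for i in reversed(range(m)):
        s = aug[i][m] - sum(aug[i][j] * z[j] for j in range(i + 1, m))
        z[i] = s / aug[i][i]
    return z

def dot(x, y):
    return sum(s * t for s, t in zip(x, y))

def estimate(xi, h, counts):
    """Estimates (a, b) from class midpoints xi, half-widths h, and class counts."""
    n = sum(counts)
    q = [c / n for c in counts]
    r = len(xi)
    y = [math.log(q[i]) - math.log(q[i + 1]) for i in range(r - 1)]
    w = [y[i] - math.log(h[i]) + math.log(h[i + 1]) for i in range(r - 1)]
    u = [xi[i + 1] - xi[i] for i in range(r - 1)]
    v = [math.log(xi[i]) - math.log(xi[i + 1]) for i in range(r - 1)]
    M0 = moment_matrix(q, n)
    Mu, Mv = solve(M0, u), solve(M0, v)  # M0^{-1} u and M0^{-1} v
    uu, uv, vv = dot(u, Mu), dot(u, Mv), dot(v, Mv)
    uw, vw = dot(w, Mu), dot(w, Mv)
    delta = uu * vv - uv ** 2
    a_hat = (vv * uw - uv * vw) / delta
    b_prime = (uu * vw - uv * uw) / delta
    return a_hat, b_prime + 1.0

# Idealized check: class proportions that follow approximation (5) exactly.
a_true, b_true = 0.8, 2.5
xi = [0.5, 1.5, 2.5, 3.5, 4.5]
h = [0.5] * 5
weights = [math.exp(-a_true * x) * x ** (b_true - 1) * 2 * hw for x, hw in zip(xi, h)]
counts = [1000 * wt / sum(weights) for wt in weights]
a_hat, b_hat = estimate(xi, h, counts)
print(a_hat, b_hat)
```

When the class proportions satisfy (5) exactly, the relation $w_i = au_i + b'v_i$ holds without error, so the estimates reproduce the parameters used to generate them; with real counts the estimates are subject to the sampling variation described by (18).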
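The estimates $\hat{a}$ and $\hat{b}'$ are found by direct simple routine calculations except
for the determination of $\mathfrak{M}_0^{-1}$ from $\mathfrak{M}_0$. This may be a tedious process
unless $r$ is small. However, if all $p_i$ are equal to $1/r$, then

$\frac{r^2}{n}\,\mathfrak{M}^{-1} = \begin{pmatrix}
r-1 & r-2 & r-3 & \cdots & 3 & 2 & 1 \\
r-2 & 2(r-2) & 2(r-3) & \cdots & 6 & 4 & 2 \\
r-3 & 2(r-3) & 3(r-3) & \cdots & 9 & 6 & 3 \\
\vdots & \vdots & \vdots & & \vdots & \vdots & \vdots \\
3 & 6 & 9 & \cdots & 3(r-3) & 2(r-3) & r-3 \\
2 & 4 & 6 & \cdots & 2(r-3) & 2(r-2) & r-2 \\
1 & 2 & 3 & \cdots & r-3 & r-2 & r-1
\end{pmatrix}.$

This is easily verified by direct multiplication.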
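
The direct multiplication can be carried out in exact rational arithmetic. This sketch assumes that, for equal $p_i = 1/r$, the moment matrix is $(r/n)$ times the tridiagonal matrix with 2 on the diagonal and $-1$ beside it, and that the inverse has $(i, j)$ element $(n/r^2)\min(i,j)\,(r - \max(i,j))$, which is the pattern of the matrix displayed above. The function names are hypothetical.

```python
from fractions import Fraction

def moment_matrix_equal(r, n):
    """Moment matrix of the y_i when every p_i equals 1/r (assumed form)."""
    p = Fraction(1, r)
    m = r - 1
    M = [[Fraction(0)] * m for _ in range(m)]
    for i in range(m):
        M[i][i] = 2 / (n * p)
        if i + 1 < m:
            M[i][i + 1] = M[i + 1][i] = -1 / (n * p)
    return M

def closed_form_inverse(r, n):
    """Closed-form inverse: (n / r^2) * min(i, j) * (r - max(i, j)), 1-based."""
    m = r - 1
    return [[Fraction(n, r * r) * min(i, j) * (r - max(i, j))
             for j in range(1, m + 1)] for i in range(1, m + 1)]

def matmul(A, B):
    m = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(m)) for j in range(m)]
            for i in range(m)]

r, n = 6, 120
product = matmul(moment_matrix_equal(r, n), closed_form_inverse(r, n))
identity = [[Fraction(int(i == j)) for j in range(r - 1)] for i in range(r - 1)]
print(product == identity)
```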
It was pointed out that if $\mathfrak{M}$ were known, least-squares estimates would be
in some sense best among estimates which are functions of the $y_i$. What does
the substitution of $\mathfrak{M}_0^{-1}$ for $\mathfrak{M}^{-1}$ do to the estimates? Denoting by $m_{ij}$, $m^0_{ij}$
representative elements of $\mathfrak{M}$, $\mathfrak{M}_0$, it is known that $m^0_{ij}$ converges in probability
to $m_{ij}$ as $n \to \infty$. Since $\hat{a}$, $\hat{b}'$ are continuous functions of $m^0_{ij}$, the results
of Mann and Wald cited above [10] are sufficient to conclude that $\hat{a}(\mathfrak{M}_0^{-1})$,
$\hat{b}'(\mathfrak{M}_0^{-1})$ have the same limiting distribution as $\hat{a}(\mathfrak{M}^{-1})$, $\hat{b}'(\mathfrak{M}^{-1})$. Consequently,
to terms of order $1/n$ the variance-covariance matrix is given by (18). This result is,
of course, closely related to similar ones in modified minimum $\chi^2$ estimation;
e.g., see Neyman [11].

For these asymptotic results to hold rigorously, it is necessary not only that
$n \to \infty$, but also that $\max_{1 \le i \le r} h_i \to 0$ (so that (5) holds exactly), which in turn
implies $r \to \infty$. Furthermore, for the multinormality of the $q_i$ we need
$\max_{1 \le i \le r} h_i \to 0$, $r \to \infty$ in such a way that $\min_{1 \le i \le r} np_i \to \infty$.

However, these considerations leave open the question of determining $r$ and
the $\xi_i$ in any practical situation. While some studies have been made of the
optimum allocation in linear regression problems (e.g., Elfving [12]), these
refer to situations where the observations are independent. Moreover, our
choice of the $\xi_i$ is limited by the requirement that the classes should not be so
broad that (5) is seriously invalidated. Since $\mathfrak{M}_0^{-1}$ is quite simple if the $q_i$ are
all equal to $1/r$, it seems to be reasonable to choose the $\xi_i$ so that this is so.
The device is analogous to that suggested by Gumbel [13] and by Mann and
Wald [14] in applying the $\chi^2$ "goodness of fit" test.

The fact that $\mathfrak{M}$ involves reciprocals of the $np_i$ makes it seem desirable that
no $nq_i$ should fall below 10. This will set an upper bound for $r$, namely, $r \le
n/10$. The lower bound should be determined so that (5) is a reasonable approximation,
though more usually it will be determined by considerations of the
labor involved in calculating (16), (17), and (18).

Of course, it will often happen that the data will be grouped to begin with,
so that the statistician is not free to choose the $\xi_i$ or $r$. It should be noted that
in any case simpler but less efficient estimates can be obtained by utilizing
only the odd (or even) $w_i$'s. The odd $w_i$'s are mutually independent among
themselves and consequently $\mathfrak{M}$ and $\mathfrak{M}^{-1}$ reduce to diagonal matrices.


5. Estimation with unknown origin. If the parameter c, the origin, is unknown, 
then the estimation problem is more difficult whether or not the distribution is 
truncated. Iterative methods are of course possible in solving (1), (2), and (3) 
with the aid of Table I, i.e., for the untruncated case. In the truncated case 
this method is too tedious to have much practical value. 

If, in the truncated case, there is available supplementary information so 
that the restriction 0 < c < & may be utilized, then a procedure similar to 
that outlined above may be followed. In this case 

(19)    $\ln p_i - \ln p_{i+1} = a(\xi_{i+1} - \xi_i) + (b - 1) \ln \dfrac{\xi_i - c}{\xi_{i+1} - c} + \ln \dfrac{h_i}{h_{i+1}}$ 

(i = 1, 2, ..., r − 1) 

again to the degree of approximation indicated by (5). With the restriction 
noted above, it is adequate to write 


(20)    $\ln p_i - \ln p_{i+1} = a(\xi_{i+1} - \xi_i) + (b - 1) \ln \dfrac{\xi_i}{\xi_{i+1}} + (b - 1)\, c \left( \dfrac{1}{\xi_{i+1}} - \dfrac{1}{\xi_i} \right) + \ln \dfrac{h_i}{h_{i+1}}.$ 


Defining $y_i$, $w_i$ as above, least-squares estimates of a, b, and c may be found 
in a procedure exactly analogous to that of Section 4. 


6. Conclusion. The method used to estimate the parameters a and b in 
Section 4 may also be applied if the sample is drawn from a doubly truncated 
gamma distribution; from a singly or doubly truncated normal distribution; 
or from a beta distribution with known range, either truncated or not. Methods 
of obtaining the maximum likelihood estimates of the parameters of a truncated 
normal distribution are, of course, well known, and extensive tabulations have 
been made to facilitate the determination of such solutions (e.g., compare 
particularly Hald [15]). 

The method outlined above would also be useful in estimating the parameters 
of the normal curve where there are systematic gaps in the observations. This 
may occur particularly in time distributions—an example may be found in 
[16]. For distributions with finite but unknown range, however, the method 
does not appear to be satisfactory. 
APPROXIMATE UPPER PERCENTAGE POINTS FOR EXTREME VALUES 
IN MULTINOMIAL SAMPLING 


By Robert M. Kozelka 
Tufts College and Harvard University 


1. Summary. Given a k-fold multinomial distribution with equal probability 
for each category, the probability of the largest frequency in any category is 
desired. A simple asymptotic approximation to the upper percentage points of 
this distribution is obtained. A table of .95 and .99 points of the approximation 
for k = 1(1)25, and a table comparing these with actual values for k = 3, 4, 5 
and n = 3(1)12, are provided. An investigation of the moment problem is given. 


2. The approximation. The problem of testing for a significant difference 
between two observations has long been rather completely solved, but the 
extension to 3 or more observed values has only recently been comprehensively 
undertaken. Particularly, the problem of testing whether the largest observed 
categorical frequency in a multinomial distribution is significant is of interest 
to social scientists. One wishes to have a subject rate n situations on a k-point 
scale and then to inquire whether the number of situations occurring most 
frequently at a scale point is significant so that one might further study the 
properties of such situations. The extreme categorical frequencies are of interest, 
since they are the most valuable for further study; and the null hypothesis of 
equal categorical probabilities is the most likely beginning hypothesis for the 
social scientist in this situation. 

Let $F_1, F_2, \ldots, F_k$ be the observed proportions of a sample of n objects into 
k multinomial categories with assumed equal probabilities. Using the well-known 
multivariate normal approximation to the multinomial one computes 
easily [3] that in this problem the observed proportions are asymptotically jointly 
multivariately normally distributed with means $1/k$, variances $(k - 1)/k^2 n$, 
and covariances $-1/k^2 n$. Let 

(1)    $t_i = \dfrac{F_i - 1/k}{[(k - 1)/k^2 n]^{1/2}}$    $(i = 1, 2, \ldots, k)$ 

be the corresponding standardized variable, and let $E_i$ represent the event 
$t_i \ge t^*$. From 

(2)    $\Pr(\max t_i \ge t^*) = \Pr(E_1 \cup E_2 \cup \cdots \cup E_k) = \sum_i \Pr(E_i) - \sum_{i<j} \Pr(E_i E_j) + - \cdots,$ 

and the fact that the partial sums alternate about the total sum, it follows that 

(3)    $\sum_i \Pr(E_i) \ge \Pr(\max t_i \ge t^*) \ge \sum_i \Pr(E_i) - \sum_{i<j} \Pr(E_i E_j).$ 


Received March 8, 1954, revised July 1, 1955. 
Since $\Pr(E_i E_j) \le \Pr(E_i)\Pr(E_j)$ and all categorical probabilities are equal, (3) 
reduces to 

(4)    $\Pr(\max t_i \ge t^*) \ge k \Pr(E_1) - \tfrac{1}{2} k(k - 1)[\Pr(E_1)]^2.$ 

For $t^*$ sufficiently large, $[\Pr(E_1)]^2$ is small enough to be neglected, and we have 
approximately 

(5)    $\Pr(\max t_i \ge t^*) \approx k \Pr(E_1).$ 

Because of the asymptotic normality, it follows that 

(6)    $\Pr(\max t_i \ge t^*) \approx \dfrac{k}{\sqrt{2\pi}} \int_{t^*}^{\infty} e^{-t^2/2}\, dt$ 

for $t^*$ large. (The author is indebted to the referee for the above simplified proof.) 

Table 1 gives critical values of t for .95 and .99 significance levels for k = 
1, 2, ..., 25. For selected values of k and n, table 2 gives a comparison between 
the approximate values of the actual frequencies and the computed values from 
the exact distributions. Since observed categorical frequencies must necessarily 
be integers, the approximation appears satisfactory even for small values of n. 
The fractional computed values were arrived at by spreading the probability 
for a given integral value over a unit interval extending one-half unit on each 
side of the given integer. Further computations by the author [2] indicate that 
the approximation decreases in accuracy for increasing k. This is suggested by 
(4) above. 
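
The approximation in (5) and (6) can be sketched numerically. The short Python example below is only an illustration under the one-sided reading of (6), in which $t^*$ solves $k[1 - \Phi(t^*)] = \alpha$; the function names `critical_t` and `largest_frequency_point` are hypothetical and are not taken from the paper, and no claim is made that the output reproduces the printed tables.

```python
from math import sqrt
from statistics import NormalDist


def critical_t(k, alpha):
    """Solve k * (1 - Phi(t)) = alpha for t, following approximation (6)."""
    return NormalDist().inv_cdf(1.0 - alpha / k)


def largest_frequency_point(n, k, t_star):
    """Convert a standardized point t* back to a frequency by inverting (1):
    n * F = n/k + t* * sqrt(n * (k - 1)) / k."""
    return n / k + t_star * sqrt(n * (k - 1)) / k


if __name__ == "__main__":
    # Illustrative values of k and n only.
    for k in (3, 4, 5):
        t = critical_t(k, 0.05)
        print(k, round(t, 4), round(largest_frequency_point(12, k, t), 4))
```

The second function simply rescales a standardized point to the frequency scale used in Table 2.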


TABLE 1 


= 


.576 
807 
.936 
024 
090 
.144 
189 


997 


261 
.291 
317 
3.342 


bo 


.899 
.913 
.936 
.956 
.974 
991 
.008 
024 
038 
053 
065 
079 
3.090 
TABLE 2 
Computed vs. approximate .05 and .01 values of upper percentage points for the 
largest observation from a multinomial sample 

                    k = 3               k = 4               k = 5 
  n    Level    Comp.   Approx.     Comp.   Approx.     Comp.   Approx. 
  3    .05      3.05    2.9547 
       .01      3.41    3.3973 
  4    .05      3.4562  3.5355      3.3166  3.1633 
       .01      4.2298  4.1014      3.8597  3.6188 
  5    .05      4.1950  4.1902      3.7133  3.6687      3.4359  3.3041 
       .01      4.6890  4.7615      4.3960  4.1780      4.2375  3.7638 
  6    .05      4.5707  4.7643      4.2614  4.1495      3.9530  3.7240 
       .01      5.3807  5.3902      4.9864  4.7074      4.4738  4.2276 
  7    .05      5.2445  5.3191      4.5327  4.6118      4.3142  4.1262 
       .01      6.0499  5.9951      5.3996  5.2144      5.1213  4.6702 
  8    .05      5.6751  5.8587      5.1404  5.0594      4.9447  4.5145 
       .01      6.4562  6.5813      5.9496  5.70637     5.4164  5.0960 
  9    .05      6.2542  6.3856      5.4363  5.4950      5.0802  4.8913 
       .01      7.177%  7.1521      6.3692  6.1783      5.6570  5.5081 
 10    .05      6.6839  6.9021      5.9455  5.9205      5.4012  5.2584 
       .01      7.5219  7.7100      6.8255  6.6408      6.4798  5.9086 
 11    .05      7.2368  7.4096 
       .01      8.2369  8.2570 
 12    .05      7.6402  7.9094 
       .01      8.6580  8.7945 


3. The moment problem. Greenwood and Glasgow [1] have investigated the 
moments of the above distribution for k = 2 and 3. They arrived at exact and 
approximate means and variances for k = 2 and at approximate means and 
variances for a chosen pair in the k = 3 situation. An effort to extend their 
methods to the general case was almost completely unsatisfactory. 

For the case k = 3, the approximate probability density function corresponding 
to (6) provides a suitable approach to the moment-generating function. 
Assuming $t_3 \ge t_2 \ge t_1$, one has approximately 


' 


3! “ © (3)4ts 
(7) mef (ts) = on e ; [ exp {— 4 (ts ass 6)°) dts; / ett dts ° 
“ /0 “0 


For $\theta$ sufficiently small, the integral over region III in Fig. 1 may be taken as 
approximately equal to the area of this triangular region. One has 


g 3 » (3/2) 40 7 2) /3 
(8) me)f (4) = exp (36) E + = | exp (—432°) dr + a v3 ] . 
2r Jo 2r 8 


and for small $\theta$ this is approximately 


(9) megf (t;) = exp (46°) E + ie 6+ ere ]. 
2V 24 oT 
Expanding $\exp(\tfrac{1}{2}\theta^2)$ in series and multiplying yields 


9 ./a 2 
(10) met (o) #1 + 3¥3 64 S/S V3 4a] 4... 
2 40 


Je 5 
V 24 - 


This is the mgf of $t_3 = (F_3 - \tfrac{1}{3})/\sigma_3$, where $\sigma_3 = (1 \cdot 2 \cdot 1 / 3 \cdot 3 \cdot n)^{1/2}$, and hence 


(11) E(F3) = o3E (ts) + . = : 2: + .. 


2x 3 


Multiplying this result by n gives the expected value of the greatest number in 
any of the three categories. 
For the variance of $F_3$ we have 
(12) BUF) & ol) + BW) +5 = 5+ = +5 (gt as) 
' oe Ee) Tg BW TOOT Veen n\O" Seva): 


so that the proper subtraction gives 


2 3 1 07: 


3) var (F,) « On 4xn - 2an V3 . n 


This is in accord with approximations to the moments as performed in the 
thesis [2] of which this paper is a part. Two approximations were attempted: one 
by standardizing the variables and performing integrations of the resulting 
multivariate normal distribution; the other by approximating the sums in the 
exact expected values by means of Stirling's formula for the factorial. Both of 
these approximations gave the same results as the above moments for k = 3. 
For higher values of k, the first method became excessively laborious and the 
second broke down completely. 

An extension of the mgf technique even to the case k = 4 presents difficulties; 
analogous to (8) we have 


' 4! mF is 
mef (4) = 75am exp (26) [ exp [—3(t, — 6)] dts 
‘ us /0 
(14) (2) 42,4 i »(3)4t3 p 
| exp (— }t3) dts | exp (— 312) df. 
0 0 
The regions I, II, and III are again available (Fig. 2), and the integral over I is 
still equal to unity; but simple approximations to the integrals over the other 
regions are not apparent. It is clear that for higher values of k these difficulties 
become serious and satisfactory approximations become less elementary. No 
effort has been made to evaluate the mgf for general values of k. 
AN EXTENSION OF THE KOLMOGOROV DISTRIBUTION 


By JEROME BLACKMAN 


Syracuse University 


1. Summary. Let $x_1, x_2, \ldots, x_n, x'_1, x'_2, \ldots, x'_{nk}$ be independent random 
variables with a common continuous distribution F(x). Let $x_1, x_2, \ldots, x_n$ have 
the empiric distribution $F_n(x)$ and $x'_1, x'_2, \ldots, x'_{nk}$ have the empiric distribution 
$G_{nk}(x)$. The exact values of $P(-y < F_n(s) - G_{nk}(s) < x$ for all $s)$ and 
$P(-y < F(s) - F_n(s) < x$ for all $s)$ are obtained, as well as the first two terms 
of the asymptotic series for large n. 


2. Introduction. In a famous paper, Kolmogorov [10] showed that if F(x) is a 
continuous distribution function, $x_1, x_2, \ldots, x_n, \ldots$ are independent random 
variables with distribution F(x), and $F_n(x)$ is the empirical distribution based 
on the variables $x_1, x_2, \ldots, x_n$, then 

(1)    $\lim_{n \to \infty} P\left( \sup_{-\infty < x < \infty} |F_n(x) - F(x)| < \dfrac{\lambda}{\sqrt{n}} \right) = \sum_{j=-\infty}^{\infty} (-1)^j e^{-2j^2\lambda^2}.$ 

Since then other proofs [4], [3], [6] have been given, and Chung [2], using the 
Kolmogorov method of proof, obtained an error term of the order of n 
which he then used to obtain a strong limit theorem. 

Smirnov obtained a result related to (1) when he showed [13] that 

(1')    $\lim_{n \to \infty} P\left( \sup_{-\infty < x < \infty} (F_n(x) - F(x)) < \dfrac{\lambda}{\sqrt{n}} \right) = 1 - e^{-2\lambda^2}.$ 

Actually Smirnov's results are stronger, since he obtained an exact expression 
for the probability in (1') (for finite n) as well as the first two terms of the 
asymptotic expansion. In an earlier paper, Smirnov showed also that 

(1'')    $\lim_{n \to \infty} P\left( \sup_{-\infty < s < \infty} |F_{n_1}(s) - G_{n_2}(s)| < \dfrac{\lambda}{\sqrt{n}} \right) = \sum_{j=-\infty}^{\infty} (-1)^j e^{-2j^2\lambda^2}$ 

under the condition $\lambda > 0$, $n = n_1 n_2/(n_1 + n_2)$ and $n_1/n_2 = \tau$. (See [13] for 
further references.) 

More recently, Gnedenko, Korolyuk, Rvačeva, and Mihalevič [7], [8], [9], 
[12] have developed a technique for treating problems of this sort by random-walk 
methods and have obtained error terms for (1'') under the condition $n_1 = n_2$. 
We intend in this paper to develop their method further and apply it to 
obtaining exact expressions for the probabilities appearing in (1) and in (1'') 
under the condition that $\tau$ is an integer. For completeness we are repeating 


Received January 14, 1955. 
some of the work appearing in the above-mentioned papers. Korolyuk has 
recently published a lengthy paper [11] giving expressions for many of the 
probabilities we wish to treat. His results, however, differ from ours and indeed 
are not consistent with earlier-published work (Gnedenko, Doklady 82 (1952), 
pp. 525-528; also Math. Nachr., Vol. 12 (1954), pp. 29-63). 

Our principal results are the following two theorems. 

THEOREM 1. Let $x_1, x_2, \ldots, x_n, x'_1, x'_2, \ldots, x'_{nk}$ be a sequence of n(k + 1) 
independent random variables with a common continuous distribution. Let $F_n(x)$ 
and $G_{nk}(x)$ be empiric distributions based on the first n and second kn random 
variables, respectively. Then 

$P(-x < G_{nk}(s) - F_n(s) < y \text{ for all } s) = 1 - \binom{(k+1)n}{n}^{-1} \Big\{ \sum_{i=1}^{[(kn+\beta)/(\alpha+\beta)]} N(k, n, i(\alpha + \beta) - \beta) + \sum_{i=1}^{[(kn+\alpha)/(\alpha+\beta)]} N(k, n, i(\alpha + \beta) - \alpha) - 2 \sum_{i=1}^{[kn/(\alpha+\beta)]} N(k, n, i(\alpha + \beta)) \Big\}$ 

if $-1 \le -x < y \le 1$, where $\alpha = -[-xkn]$, $\beta = -[-ykn]$ and 

$N(k, n, a) = \sum_{j=0}^{[n - a/k]} \dfrac{a}{(k+1)j + a} \binom{(k+1)j + a}{j} \binom{(k+1)(n-j) - a}{n-j}.$ 

THEOREM 2. 

$P(-x < F_n(s) - F(s) < y \text{ for all } s) = 1 - \sum_{i=1}^{[(1+y)/(x+y)]} \varphi_n(ix + (i-1)y) - \sum_{i=1}^{[(1+x)/(x+y)]} \varphi_n((i-1)x + iy) + 2 \sum_{i=1}^{[1/(x+y)]} \varphi_n(ix + iy)$ 

if $-1 \le -x < y \le 1$, where 

$\varphi_n(x) = \sum_{j=0}^{[n - xn]} \dfrac{xn}{j + xn} \binom{n}{j} \dfrac{(j + xn)^j ((n-j) - xn)^{n-j}}{n^n}.$ 

From these two theorems various limiting relations may be computed. 
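
The function $\varphi_n$ of Theorem 2 is a finite sum and can be evaluated directly. The following Python sketch is illustrative only: the name `phi_n` and the small floating-point tolerance in the upper summation limit are assumptions of this example, not part of the paper. It evaluates $\varphi_n(x)$ for $0 < x \le 1$, which by the remark following the proof of Theorem 2 equals the one-sided probability that $F(s) - F_n(s)$ reaches x for some s.

```python
from math import comb, floor


def phi_n(n, x):
    """Evaluate phi_n(x) = sum_{j=0}^{[n - xn]} xn/(j + xn) * C(n, j)
    * (j + xn)^j * ((n - j) - xn)^(n - j) / n^n  for 0 < x <= 1."""
    if not 0.0 < x <= 1.0:
        raise ValueError("x must lie in (0, 1]")
    xn = x * n
    upper = floor(n - xn + 1e-12)  # tolerance guards against rounding in n - xn
    total = 0.0
    for j in range(upper + 1):
        rest = max(0.0, (n - j) - xn)  # clip tiny negative rounding errors
        total += xn / (j + xn) * comb(n, j) * (j + xn) ** j * rest ** (n - j)
    return total / n ** n


if __name__ == "__main__":
    # For n = 1 the sum collapses to 1 - x.
    print(phi_n(1, 0.3))
    # For n = 2, x = 1/4 the value can be checked by hand from two uniforms.
    print(phi_n(2, 0.25))
```

For n = 1 the single term gives 1 − x, and for n = 2 the values can be confirmed by elementary calculation with two uniform order statistics.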


3. Proofs and corollaries. Suppose given a collection of n(k + 1) independent 
random variables in two sequences: 

$x_1, x_2, \ldots, x_n;$ 
$x'_1, x'_2, \ldots, x'_{nk}.$ 

Let $F_n(x)$ be the empirical distribution function, continuous on the right, with 
jumps 1/n at $x_1, x_2, \ldots, x_n$, and let $G_{nk}(x)$ be the empirical distribution 
function, continuous on the right, with jumps 1/nk at $x'_1, x'_2, \ldots, x'_{nk}$. We introduce 
the following notation: 

$D^+ = \sup_x (G_{nk}(x) - F_n(x)),$ 

$D^- = -\inf_x (G_{nk}(x) - F_n(x)) = \sup_x (F_n(x) - G_{nk}(x)).$ 

The method used will involve finding the joint distribution of $D^+$ and $D^-$ and 
then taking the limit as $k \to \infty$. This will provide a proof of Theorem 1 and 
Theorem 2. 

Gnedenko and Korolyuk [7] introduced a technique for finding the joint distribution 
of $D^+$ and $D^-$ by considering a related random-walk problem. Order 
the $x_i$ and $x'_i$ random variables in order of their numerical value and call the 
new sequence 

$z_1, z_2, \ldots, z_{n(k+1)}.$ 

Let 

$v_i = +1$ if $z_i$ is from $x'_1, x'_2, \ldots, x'_{nk}$, 
$v_i = -k$ if $z_i$ is from $x_1, x_2, \ldots, x_n$, 

and 

$S_p = \sum_{i=1}^{p} v_i$    for $p = 1, 2, \ldots, (k+1)n$. 

Then it is not difficult to see that 

(2)    $P(D^- < y, D^+ < x) = P(-\beta < S_j < \alpha,\ j = 1, 2, \ldots, (k+1)n),$ 

where $\alpha = -[-xkn]$ and $\beta = -[-ykn]$. This reduces the problem to one of 
investigating a linear random walk with (k + 1)n steps which starts at 0, moves 
at each step either one unit to the right or k units to the left, and ends after 
(k + 1)n steps at 0 again. In [7], [8] the investigation was carried out for k = 
1, although the authors were apparently unaware that the k = 1 case had been 
treated extensively by Bachelier [1] in connection with certain gambler's-ruin 
problems. Some results in the k > 1 case were obtained in [9]. Because of the 
independence of $x_1, x_2, \ldots, x_n, x'_1, x'_2, \ldots, x'_{nk}$, each path is equally likely, 
so that we essentially have only to count paths. 
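
The reduction to a random walk can be illustrated on a concrete pair of samples. The Python sketch below is a minimal illustration under the assumption that there are no ties (as holds with probability one for a continuous distribution); the function names `walk_statistics` and `direct_statistics` are hypothetical. It computes $D^+$ and $D^-$ once from the partial sums $S_p$ and once directly from the two empirical distribution functions, so that the two routes can be compared.

```python
def walk_statistics(x, x_prime):
    """Return (D+, D-) from the walk S_p: step +1 for each x'_i, -k for each x_i."""
    n, nk = len(x), len(x_prime)
    if n == 0 or nk % n != 0:
        raise ValueError("len(x_prime) must be a positive multiple of len(x)")
    k = nk // n
    # Merge both samples in increasing order, remembering the step of each point.
    tagged = sorted([(v, -k) for v in x] + [(v, 1) for v in x_prime])
    s = s_max = s_min = 0
    for _, step in tagged:
        s += step
        s_max = max(s_max, s)
        s_min = min(s_min, s)
    # S_p / (nk) equals G_nk - F_n evaluated at the p-th ordered point.
    return s_max / nk, -s_min / nk


def direct_statistics(x, x_prime):
    """Return (D+, D-) by evaluating both empirical d.f.'s at every jump point."""
    x, x_prime = list(x), list(x_prime)
    n, nk = len(x), len(x_prime)
    d_plus = d_minus = 0.0
    for s in x + x_prime:
        f = sum(v <= s for v in x) / n
        g = sum(v <= s for v in x_prime) / nk
        d_plus = max(d_plus, g - f)
        d_minus = max(d_minus, f - g)
    return d_plus, d_minus


if __name__ == "__main__":
    print(walk_statistics([0.5], [0.1, 0.9]))
    print(direct_statistics([0.5], [0.1, 0.9]))
```

With n = 1, k = 2 and the points 0.1, 0.5, 0.9 the walk visits 1, −1, 0, so both statistics equal 1/2.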
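Divide the class of all paths $\mathfrak{M}$ into the following nonintersecting sets: 
$\mathfrak{A}_0$: paths reaching neither $-\beta$ nor $\alpha$. 
$\mathfrak{A}_1$: paths reaching $\alpha$ but not $-\beta$. 
$\mathfrak{A}_2$: paths first reaching $\alpha$ (i.e., before reaching $-\beta$), then reaching at 
some subsequent step $-\beta$, but not thereafter reaching $\alpha$. 
$\mathfrak{A}_3$: paths first reaching $\alpha$, then $-\beta$, then $\alpha$, but not thereafter reaching 
$-\beta$. 
etc. 
The classes $\mathfrak{B}_i$, i = 1, 2, ... are defined in the same way with $-\beta$ and $\alpha$ 
interchanged. For k and n fixed, the classes $\mathfrak{A}_i$ and $\mathfrak{B}_i$ will be empty for i 
sufficiently large. Also, 

$\mathfrak{M} = \mathfrak{A}_0 + \sum_{i \ge 1} (\mathfrak{A}_i + \mathfrak{B}_i).$ 

The classes $A_i$ are defined as follows: 

$A_1$: paths reaching $\alpha$ at least once, regardless of what happens at any 
other step. 
$A_2$: paths reaching $\alpha$ and $-\beta$ at least once in the order $\alpha, -\beta$, regardless 
of what happens at any other step. 
$A_3$: paths reaching $\alpha$ and $-\beta$ at least once in the order $\alpha, -\beta, \alpha$, regardless 
of what happens at any other step. 
etc. 
The classes $B_i$ are defined in the same way with $\alpha$ and $-\beta$ interchanged. 
Because of the equalities 

$A_1 = \mathfrak{A}_1 + \sum_{i \ge 2} (\mathfrak{A}_i + \mathfrak{B}_i), \qquad B_1 = \mathfrak{B}_1 + \sum_{i \ge 2} (\mathfrak{A}_i + \mathfrak{B}_i),$ 

$A_2 = \mathfrak{A}_2 + \sum_{i \ge 3} (\mathfrak{A}_i + \mathfrak{B}_i), \qquad B_2 = \mathfrak{B}_2 + \sum_{i \ge 3} (\mathfrak{A}_i + \mathfrak{B}_i),$ 

etc., we have for arbitrary $i \ge 1$ 

$A_{2i-1} + B_{2i-1} - A_{2i} - B_{2i} = \mathfrak{A}_{2i-1} + \mathfrak{B}_{2i-1} + \mathfrak{A}_{2i} + \mathfrak{B}_{2i},$ 
so that 
$\mathfrak{A}_0 = \mathfrak{M} - \sum_{i \ge 1} (A_{2i-1} + B_{2i-1} - A_{2i} - B_{2i}).$ 

Since $A_{2i-1} - A_{2i}$ and $B_{2i-1} - B_{2i}$ are disjoint, $A_{2i-1} \supset A_{2i}$, and $B_{2i-1} \supset B_{2i}$, 
we have 

$N(\mathfrak{A}_0) = N(\mathfrak{M}) - \sum_{i \ge 1} (N(A_{2i-1}) + N(B_{2i-1}) - N(A_{2i}) - N(B_{2i})),$ 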


where N(A) is the cardinality of A. This formula was obtained in both [1] and 
[8], although the computation of the number of paths in the classes $A_i$, $B_i$ was 
carried out only for k = 1, in which case a reflection principle will work. Let 
$N(k, n, \alpha)$ be the number of paths in the class $A_1$. We now show that the number 
of paths in $A_2$ is $N(k, n, \alpha + \beta)$ by mapping the class $A_2$ in a 1:1 manner 
on the class of paths which cross $\alpha + \beta$ at least once. Note that if a path crosses 
$\alpha$, it actually reaches the point $\alpha$, since all steps to the right have length one. 
Also, if $-\beta$ is crossed from the right, the path must reach $-\beta$ on the subsequent 
crossing from the left for the same reason. Divide the steps in an $A_2$ 
path into four parts $p_1, p_2, p_3, p_4$. $p_1$ consists of those steps from the first to 
the first step reaching $\alpha$. $p_2$ consists of those steps from the first after $p_1$ to the 
first step actually ending at $-\beta$. $p_3$ consists of those from the first after $p_2$ to 
the first step ending at 0, and $p_4$ consists of the remainder. The path with steps 
in the order $p_1, p_3, p_2, p_4$ then crosses $\alpha + \beta$. Moreover, this path reaches $\alpha$ 
for the first time at the end of the $p_1$ steps, reaches $\alpha + \beta$ for the first time at 
the end of the $p_3$ steps, and reaches 0 again for the first time at the end of the 
$p_2$ steps. From this we conclude that the original $A_2$ path can be reconstructed 
from its image. Since the inverse mapping takes every path crossing $\alpha + \beta$ 
into a path from $A_2$, we find that the mapping is 1:1. 
Using the same idea, the class $A_3$ can be put in 1:1 correspondence with the 
class of paths crossing $2\alpha + \beta$ at least once. In general, 

$N\{A_{2i-1}\} = N(k, n, i(\alpha + \beta) - \beta)$ if $i(\alpha + \beta) - \beta \le kn$, 
$N\{A_{2i-1}\} = 0$ if $i(\alpha + \beta) - \beta > kn$; 

$N\{B_{2i-1}\} = N(k, n, i(\alpha + \beta) - \alpha)$ if $i(\alpha + \beta) - \alpha \le kn$, 
$N\{B_{2i-1}\} = 0$ if $i(\alpha + \beta) - \alpha > kn$; 

$N\{A_{2i}\} = N\{B_{2i}\} = N(k, n, i(\alpha + \beta))$ if $i(\alpha + \beta) \le kn$, 
$N\{A_{2i}\} = N\{B_{2i}\} = 0$ if $i(\alpha + \beta) > kn$. 

Since the total number of paths is $\binom{(k+1)n}{n}$, we therefore find 

(3)    $P(D^- < y, D^+ < x) = 1 - \binom{(k+1)n}{n}^{-1} \Big\{ \sum_{i=1}^{[(kn+\beta)/(\alpha+\beta)]} N(k, n, i(\alpha + \beta) - \beta) + \sum_{i=1}^{[(kn+\alpha)/(\alpha+\beta)]} N(k, n, i(\alpha + \beta) - \alpha) - 2 \sum_{i=1}^{[kn/(\alpha+\beta)]} N(k, n, i(\alpha + \beta)) \Big\}$ 

if $-1 \le -x < y \le 1$, where $\alpha = -[-xkn]$, $\beta = -[-ykn]$. 

The computation for $N(k, n, \alpha)$ which follows is based on the work of Bachelier 
([1], pp. 101-103). The author is indebted to Dr. Warren Hirsch for pointing 
out that the basic argument given for k = 1 was to be found there and 
could be extended to k > 1. 

Observe that the paths cannot cross the point $\alpha$ before $\alpha$ steps, and that in 
general a crossing can occur only after $\alpha + (k + 1)i$ steps where $i = 0, 1, 2, 
\ldots, [n - \alpha/k]$. The upper limit on i is required by the fact that after the last 
possible crossing the path must still have enough steps left to return to the 
origin. Let $M_i$ be the total number of paths which cross at the $\alpha + (k + 1)i$ 
step and let $M'_i$ be the number of paths which cross at $\alpha + (k + 1)i$, but which 
have crossed at some earlier step. In order to cross at the $\alpha + (k + 1)i$ step, 
there must be $\alpha + ki$ positive steps to that point and i negative ones. Let $r_i$ 
denote the total number of ways of combining these $\alpha + ki$ positive and i negative 
steps in such a manner that the crossing at $\alpha + (k + 1)i$ is the first crossing. 
Let $M''_i$ denote the number of paths which at the $\alpha + (k + 1)i - 1$ step 
are at $\alpha + k$, and which at the $\alpha + (k + 1)i$ step cross the point $\alpha$. 

In order for a path to be counted in either $M'_j$ or $M''_j$, it must cross $\alpha$ for 
the first time at one of the steps $\alpha + (k + 1)i$ for $i < j$. Using this it is easy 
to see that 

$M'_j = \sum_{i<j} r_i \binom{(k+1)(j-i)}{j-i} \binom{(k+1)(n-j) - \alpha}{n-j}$ 

and 

$M''_j = \sum_{i<j} r_i \binom{(k+1)(j-i) - 1}{j-i-1} \binom{(k+1)(n-j) - \alpha}{n-j}.$ 

Since 

$\binom{(k+1)(j-i)}{j-i} \binom{(k+1)(j-i) - 1}{j-i-1}^{-1} = k + 1,$ we have 

$M'_j = (k + 1) M''_j.$ 

However, $M''_j$ can be computed directly, since of the first $(k + 1)j + \alpha - 1$ 
steps $j - 1$ must be negative, and of the remaining $(k + 1)(n - j) - \alpha$ steps 
(after the negative step which takes the path to the point $\alpha$) there must be 
$n - j$ negative: 

$M''_j = \binom{(k+1)j + \alpha - 1}{j-1} \binom{(k+1)(n-j) - \alpha}{n-j}.$ 

Therefore, since 

$M_j = \binom{(k+1)j + \alpha}{j} \binom{(k+1)(n-j) - \alpha}{n-j},$ 

$M_j - M'_j = \left[ \binom{(k+1)j + \alpha}{j} - (k+1) \binom{(k+1)j + \alpha - 1}{j-1} \right] \binom{(k+1)(n-j) - \alpha}{n-j} = \dfrac{\alpha}{(k+1)j + \alpha} \binom{(k+1)j + \alpha}{j} \binom{(k+1)(n-j) - \alpha}{n-j}.$ 

Since $M_j - M'_j$ is the number of paths which cross at the $(k + 1)j + \alpha$ step 
for the first time, 

$N(k, n, \alpha) = \sum_{j=0}^{[n - \alpha/k]} \dfrac{\alpha}{(k+1)j + \alpha} \binom{(k+1)j + \alpha}{j} \binom{(k+1)(n-j) - \alpha}{n-j}.$ 


This proves Theorem 1. 
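
The path count $N(k, n, a)$ can be checked on small cases. The Python sketch below is an illustration only; the names `paths_reaching` and `paths_reaching_brute` are hypothetical. The first function evaluates the sum defining $N(k, n, a)$ in exact rational arithmetic, and the second enumerates every path with kn steps of +1 and n steps of −k and counts those that reach the point a, so the two can be compared for small k and n.

```python
from fractions import Fraction
from itertools import combinations
from math import comb


def paths_reaching(k, n, a):
    """N(k, n, a): sum over j of a/((k+1)j + a) * C((k+1)j + a, j)
    * C((k+1)(n-j) - a, n-j), for j = 0, ..., [n - a/k]."""
    if a <= 0:
        raise ValueError("a must be a positive integer")
    total = Fraction(0)
    # (k*n - a) // k is the integer part of n - a/k; the range is empty if a > kn.
    for j in range((k * n - a) // k + 1):
        m = (k + 1) * j + a
        total += Fraction(a, m) * comb(m, j) * comb((k + 1) * (n - j) - a, n - j)
    return int(total)


def paths_reaching_brute(k, n, a):
    """Count paths (kn steps of +1, n steps of -k) that visit the point a."""
    steps_total = (k + 1) * n
    count = 0
    for downs in combinations(range(steps_total), n):
        down_set = set(downs)
        s = 0
        for p in range(steps_total):
            s += -k if p in down_set else 1
            if s == a:
                count += 1
                break
    return count


if __name__ == "__main__":
    for k, n, a in [(1, 2, 1), (2, 1, 1), (2, 1, 2), (2, 2, 1)]:
        print(k, n, a, paths_reaching(k, n, a), paths_reaching_brute(k, n, a))
```

For k = 2, n = 2, a = 1, for instance, 12 of the 15 paths reach the point 1, which agrees with the two terms 10 + 2 of the sum.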
We now investigate $\lim_{k \to \infty} \binom{(k+1)n}{n}^{-1} N(k, n, i\alpha + p\beta)$. A straightforward 
calculation shows that 

$\lim_{k \to \infty} \binom{(k+1)n}{n}^{-1} \binom{(k+1)j + i\alpha + p\beta}{j} \binom{(k+1)(n-j) - i\alpha - p\beta}{n-j} = \binom{n}{j} \dfrac{(j + ixn + pyn)^j ((n-j) - ixn - pyn)^{n-j}}{n^n}.$ 

Therefore, 

(4)    $\lim_{k \to \infty} \binom{(k+1)n}{n}^{-1} N(k, n, i\alpha + p\beta) = \sum_{j=0}^{[n - ixn - pyn]} \dfrac{ixn + pyn}{j + ixn + pyn} \binom{n}{j} \dfrac{(j + ixn + pyn)^j ((n-j) - ixn - pyn)^{n-j}}{n^n} = \varphi_n(ix + py).$ 
From (2), (3), and (4), using the fact that $\lim_{k \to \infty} G_{nk}(x) = F(x)$ uniformly 
with probability one (Glivenko-Cantelli Theorem [5], p. 260), we see that 

$\lim_{k \to \infty} P(D^- < y, D^+ < x) = \lim_{k \to \infty} P(-y < G_{nk}(s) - F_n(s) < x \text{ for all } s)$ 

$= P(-y < F(s) - F_n(s) < x \text{ for all } s)$ 

$= 1 - \sum_{i=1}^{[(1+y)/(x+y)]} \varphi_n(ix + (i-1)y) - \sum_{i=1}^{[(1+x)/(x+y)]} \varphi_n((i-1)x + iy) + 2 \sum_{i=1}^{[1/(x+y)]} \varphi_n(ix + iy).$ 
This completes the proof of Theorem 2. 
It should be noted that 
P(F(s) — F,(s) < x for alls) = 1 — ¢,(z), 


and that this is exactly the result obtained by Smirnov [13]. In this paper Smir- 
nov also outlines the technique whereby it may be shown that under the con- 
ditions z > 2» > O and 2°/+/n = o(1), 


zr —Iz2 2 & x 


Combining this result with Theorem 2, we obtain the following corollary. 
Corotiary 1. If x > rm > 0, y > yo >, 0 2°/V/n = O(1), and 7°/V/n = 
o(1), then 


P(—y/n'” < F(s) — F,(s) < x/n’” for all s) 


oo 
—2(iz+(i—1)y)?2 —2((i—1)z+iy)2 —2(iz+iy)2 
=1-D fe +e — 2 


i=l 
1 o 
; ; \ 524 i—1)y)2 
+ 3 8 { (ix a (1 a l)y)e (iz+(i y 
le Jen 


4 (i is 1)z + se 


— Aix + iy") — of? {(iz + (i — Iyer 
1 


4 (Gi sie 1x 4+ iy) os ix + iyyreneorn | F 
Using similar techniques one may take the limit as $n \to \infty$ in the result of 
Theorem 1. The computations become more complicated, and we will state 
only a somewhat weaker version than can actually be obtained. 

COROLLARY 2. For fixed x and y and k an integer, 

P(- yVE +1 - gi) — Fs) < TVET! 
</kn kn /kn 
viii il ~ l 
“aim > fe 2(iz+(i—1)y)2 + e 2((i—1) z+iy)?2 be Qe 2(iztiy)? , 

inl a/k( bk 4 1)n 

— J >. , ( k(k —1 
SE fw + ¢-no(1a-H- locviveEe Tes é— vw) 

Vk(k + 1) 


for all :) 


ton { 


gers 1)y)2 4 (a sac Lx f- F iy) 


‘ 2 ae k) m4 2((i—1z)+ ty)? 
( ‘ VEE FT 


— Wiz + iat (1—k) - 4Q(V nV kk + Ht + ty)" or —aietiv?|, 0(7.) 
Vk(k it 1) / n 


where the function Q(x) is defined by the equation 
$-[-x] = x + Q(x).$ 
The special case k = 1 is treated in [7]. 
ON THE SIMULTANEOUS ANALYSIS OF VARIANCE TEST 
BY K. V. RAMACHANDRAN 
University of North Carolina 


1. Summary. In this paper we have solved certain distribution problems con- 
nected with the simultaneous analysis of variance test [1] and have proved that 
the power function of this test has the monotonicity property. 


2. Introduction. It is well known that in situations involving the testing of the 
significance of k mean squares, the usual method of analysis of variance gives 
tests which are not independent. In these situations Ghosh [1] has recommended 
a test for the k mean squares which can be derived by the union-intersection 
principle [9]. 

The theory of simultaneous analysis of variance test and its use in certain 
problems in Public Health is given by Ghosh [1]. We shall be dealing only with 
certain distribution problems connected with the test and shall prove that the 
power function of the test has the monotonicity property. For further references, 
consult [6], [8], and [10]. 


3. Statement of the problem. Suppose we have k F-statistics 

(3.1)    $F_i = (S_i/S)(m/t_i),$ 

where $S_1, S_2, \ldots, S_k$ and S are mutually independent, $S_i/\sigma^2$ having a $\chi^2$ 
distribution under the null hypothesis with $t_i$ d.f., and $S/\sigma^2$ having a $\chi^2$ distribution 
with m d.f. For example, $S_1$ might be the sum of squares for row effects; 
$S_2$, for column effects; and S for error in a two-way layout. In the simultaneous 
analysis of variance situation [1] we are interested in evaluating 

(3.2)    $P[F_i \le a_i;\ i = 1, 2, \ldots, k]$ 

for given $a_i$, or for a given $\alpha$ to find the $a_i$'s such that 

(3.3)    $P[F_i \le a_i;\ i = 1, 2, \ldots, k] = 1 - \alpha.$ 


The optimum choice of $a_i$ is not known. Ghosh [1] has intuitively suggested 
choosing $a_i$ as proportional to $t_i$. A method of evaluating the probability on the 
left-hand side of (3.3) will be presented in Sections 4-6. The special case $t_i = 1$ 
(i = 1, 2, ..., k) has been solved by Nair [3]. We shall also prove in Section 7 
that the power of the simultaneous analysis of variance test has the monotonicity 
property. 
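
The probability in (3.2) can be estimated by simulation before any analytic evaluation is attempted. The Python sketch below is only an illustration: the function name `simultaneous_prob`, the sample size, and the seed are assumptions of this example, and the statistics are taken to be scaled as $F_i = (S_i/t_i)/(S/m)$, which is one reading of (3.1). Chi-square variates are drawn as gamma variates with shape d/2 and scale 2; the common factor $\sigma^2$ cancels in the ratio.

```python
import random


def simultaneous_prob(a, t, m, n_sim=50_000, seed=0):
    """Monte Carlo estimate of P[F_i <= a_i for all i] under the null hypothesis,
    with F_i = (S_i / t_i) / (S / m), S_i ~ chi-square(t_i), S ~ chi-square(m)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sim):
        s = rng.gammavariate(m / 2.0, 2.0)  # chi-square with m d.f.
        ok = True
        for a_i, t_i in zip(a, t):
            s_i = rng.gammavariate(t_i / 2.0, 2.0)  # chi-square with t_i d.f.
            if (s_i / t_i) / (s / m) > a_i:
                ok = False
                break
        hits += ok
    return hits / n_sim


if __name__ == "__main__":
    # Two mean squares with 2 d.f. each and 5 error d.f. (illustrative values).
    print(simultaneous_prob([7.88, 7.88], [2, 2], 5))
```

With both $a_i$ set to 7.88, two numerator d.f. each, and m = 5, the estimate should fall close to 0.95, since for this case the probability has the closed form $1 - 2(1+b)^{-m/2} + (1+2b)^{-m/2}$ with b = 2a/m.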

Received Feb. 1, 1955. 
4. Evaluation of the probability statement given on the left-hand side of (3.3). 
From (3.3) it is evident that 

(4.1)    $P\left[ \dfrac{S_i}{S} \le \dfrac{a_i t_i}{m};\ i = 1, 2, \ldots, k \right] = 1 - \alpha.$ 

Hence the simultaneous analysis of variance test depends on the evaluation of 
expressions of the form 

(4.2)    $c \int_0^{b_1} \cdots \int_0^{b_k} \prod_{i=1}^{k} G_i^{(t_i - 2)/2}\, dG_i \Big/ \Big[ 1 + \sum_{1}^{k} G_i \Big]^{(\sum t_i + m)/2},$ 

where c is a function of $t_1, t_2, \ldots, t_k$, and m and $b_i = (a_i t_i/m)$, 
(i = 1, 2, ..., k). 
Usually we will be interested in obtaining $b_1, b_2, \ldots, b_k$ such that 

(4.3)    $c \int_0^{b_1} \cdots \int_0^{b_k} \prod_{i=1}^{k} G_i^{(t_i - 2)/2}\, dG_i \Big/ \Big[ 1 + \sum_{1}^{k} G_i \Big]^{(\sum t_i + m)/2} = 1 - \alpha.$ 

This can be evaluated as follows: Denoting the left-hand side of (4.3) by 
$I(b_1, b_2, \ldots, b_k;\ t_1, t_2, \ldots, t_k;\ m)$, 
we get, by integration by parts, 


I (bi, be, -*+ bes hb, “+ yt; mM) 


=~ Peltas tay +++ tas m) 


k 
(> t; + is 2) 
1 
be-a (k-1 k (Zt ¢+m—2) /2) by 
[ Laren agar /| 1 4 LG,| } 
0 


? \t=1 1 0 


te — 2 ct, te, «++, tk; m) 


2 2 ty + m — 2 
9 


~by be (kl atl aa k (Zt ¢-+m—2) /2) 
ae | i Git 2 Gi-O2 TT ag. /|1 + LG,| , 
/0 0 i=l 1 1 ) 


—byt” c(t, ’ te aed tk ; m) 
(1 + b,) atw*/2 (au+ m ee =) 


> 


~by/(14b_) ~bg_y/(14+by) (k—1 k—1 (Zt ¢+m—2) /2 
v(t ;—2)/2 y 
| ene SIT ai" ag. /|1 + LG,| 
\ 1 1 


“0 ~0 
+ _ ti io 2 c(t, te, nea » te 5 m) 
D ti tm — 2 clty, te, +++, ten, & — 2; m) 


*T(by, be, +++ beth, te, +++ tea, te — 2; m). 
Successive reduction will leave us with the evaluation of integrals of the form 


(4.5) [" [° res me G;" ag. /|1 +> a| a 
“0 1 i - ) 


“0 “0 


Now it is easy to see that (4.5) is equivalent to 


© iV 1;V ev Vy (?-2) 72 j s 
(4.6) | dV / = ale | ——— + = II u; e “* duj. 
Jo 0 Jo Pp J 1 
: ( 3 
Also from [6] we get 
az a 
(47) J ute du = 22'7e** Do kPa! 
/0 0 
The convergence of the series on the right-hand side of (4.6) has been proved 
[7]. 
Thus the evaluation of (4.3) for given values of $b_1, b_2, \ldots, b_k$ can be carried 
out successively for different values of $t_1, t_2, \ldots, t_k$ and m by using the 
reduction formula (4.4). When m is large, (4.3) can be evaluated using tables of 
the incomplete gamma function [5]. 

It is easy to notice that the tabulation of (4.3) is rather tedious because of the 
large number of parameters involved. In the next section we shall consider the 
special but important case where $t_i = t$ (i = 1, 2, ..., k). The general reduction 
formula given by (4.4) can be used also when $t_i = t$. But the special method 
which we shall use for the special case seems to be easier to handle than the use 
of the general formula. Also, it can easily be noticed that even though the 
method used for the special case generalises, the use of this method is rather 
tedious. Hence for the general case the general reduction formula is to be used; 
and for the special case, the special method to be discussed in the next section 
is to be used. 


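5. Special case when $t_i = t$ (i = 1, 2, ..., k). In this case we have to obtain a 
"b" such that 

(5.1)    $1 - \alpha = c(k, t; m) \int_0^{b} \cdots \int_0^{b} \prod_{1}^{k} G_i^{(t - 2)/2}\, dG_i \Big/ \Big[ 1 + \sum_{1}^{k} G_i \Big]^{(kt + m)/2},$ 

where 

$c(k, t; m) = \dfrac{\Gamma\left( \frac{kt + m}{2} \right)}{\left[ \Gamma\left( \frac{t}{2} \right) \right]^k \Gamma\left( \frac{m}{2} \right)}.$ 

It is evident that (5.1) can be rewritten as 

(5.2)    $P\left[ \dfrac{S_i}{S} \le b,\ i = 1, 2, \ldots, k \right] = P\left[ \dfrac{S_{\max}}{S} \le b \right] = 1 - \alpha.$ 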
l A A 
Let us call the statistic $u = S_{\max}/S$ the studentized largest chi-square. In 
order to obtain a "b" such that (5.2) is satisfied, we shall study the distribution 
of the studentized largest chi-square. 

We shall derive in the next section certain mathematical results which we shall 
use to derive the distribution of u. 


6. Power series expansion for the incomplete gamma-type integrals. Let 


z n —u/2 k 
1 (n, ks * | we . 
aie = | o 2 T(n + D™ 


Using methods similar to those given in [7], we find that an appropriate expan- 
sion for I(n, k; x) is given by 
k(n+1) ‘s 1 baal ¢ oo 
a exp [—4(n + 1)kx/(n + 2)] (k) 4 
(62) I(@,ksz) = GO > Set X Afr’, 


where the A’s satisfy the recurrence relation 


(k) 1 
at E + k(n + >| 


eee es ot a 8 - 
(63) -[4 ease te + ae pay | 


tome «= = 0,1,2,--4) 
Notice that As” = ] and Ay" = 0 for all k. 


The convergence of the series $\sum_{i=0}^{\infty} A_i^{(k)} x^i$ is proved in the Appendix. 


7. Distribution of the studentized largest chi-square. The p.d.f. of Smgx = 0 i8 


ky? 2e-2/2 ed zt?) /2e—-z/2 1 


et | a de 
ri.) Jy a" Pr(s 


kty@t—®/2 


using (6.2). And the p.d.f. of S = 


plv) = 


(7.2) 


Multiplying (7.1) and (7.2), using the transformation u = (v/y) and integrating 
with respect to y in the interval 0 to ©, we get 


4 


~ ago Ht +m =a. i} ayes 

r (42) y 2) x (n) * f kt +2 = 
I — + ——u4 
{+2 


(7.3) plu) = 
TABLE 1 
Upper 5 per cent points of $u = S_{\max}/S$ for different values of m and t 
(see formula 7.3) when k = 2 

                                      t 
   m      1      2      3      4      6      8     10     12     16     20 
   5   9.55   7.88   7.15   6.70   6.20   5.92   5.72   5.59   5.42   5.28 
   6   8.50   6.90   6.21   5.79   5.32   5.04   4.87   4.75   4.59   4.46 
   7   7.84   6.28   5.62   5.21   4.76   4.50   4.35   4.22   4.09   3.96 
   8   7.39   5.86   5.23   4.81   4.38   4.14   3.98   3.86   3.74   3.61 
  10   6.81   5.32   4.70   4.32   3.90   3.66   3.51   3.40   3.28   3.16 
  12   6.43   4.99   4.38   4.01   3.61   3.38   3.23   3.12   3.00   2.88 
  16   6.02   4.62   4.02   3.66   3.27   3.05   2.90   2.79   2.66   2.56 
  20   5.81   4.41   3.81   3.46   3.06   2.85   2.71   2.60   2.47   2.37 
  24   5.66   4.29   3.69   3.34   2.95   2.74   2.59   2.49   2.36   2.25 
   ∞   5.02   3.69   3.12   2.79   2.41   2.19   2.05   1.94   1.80   1.71 


From (7.3) it is evident that the distribution of u can be tabulated using tables 
of the incomplete beta function [4]. Upper 5 per cent points of u are given in 
Table 1 for k = 2 and for different values of t and m. The methods presented in 
Sections 4-7 will enable us to evaluate integrals of the form 


e ~biy 


[| 


0 “0 


» bay 
(7.4) | 


k 
ye” dy [J xf ie**? -dz;. 
1 


These integrals are found to be useful in obtaining lower bounds to the power 
of the Hartley test for equality of several variances from univariate normal popu- 
lations [2]. 

8. Power function of the simultaneous analysis of variance test. In this situation,
$S_i/\sigma^2$ has a noncentral $\chi^2$ distribution with $t_i$ d.f. and noncentrality
parameter $\lambda_i$ $(i = 1, 2, \dots, k)$, and $S/\sigma^2$ has a $\chi^2$ distribution
with $m$ d.f. We shall now prove the following theorem.

THEOREM. The power function of the simultaneous analysis of variance test is a
monotonic increasing function of the absolute value of the square root of each of the
deviation parameters $\lambda_1, \lambda_2, \dots, \lambda_k$ separately.


PROOF. The second kind of error of the simultaneous analysis of variance test
can easily be shown to be equal to


- té m - ts m 
ef exp] -4(X - #,+ Du) | : dz., I] dys, 


a 


(8.1) Bs = 


where the domain of integration D is given by 


had 2 ti ‘ 
= (2. + Vu) + do tis 


° 
lA 


Sh > i 


S (um, + Vm)? + Dak, she DV yi | 


2 1 


and $c > 0$ is a pure constant independent of the $\lambda$'s.

It is easy to see that each $\lambda$ enters $\beta$ in the same way. Hence we shall prove the
theorem only for $\lambda_1$. For any other $\lambda_i$, the theorem is immediate. Notice that
$\lambda_1$ occurs only with $x_{11}$. From (8.1) we get the limits of $x_{11}$ to be


19 


m etl 1/2 h m 2 ti ik 1/2 P 
(8.2) — (0 » Yi ~- 2 zi.) on Vx: Sin 3 (0, _ ee 2. zi) — Re. 


2 1 2 


Now in (8.1) let us first perform the integration over $x_{11}$. The contribution to
the total p.d.f. made by $x_{11}$ is const. $\exp(-\tfrac{1}{2} x_{11}^2)$. The upper and lower limits of
the $x_{11}$ integration are $l_2$ and $l_1$, given by

m ti \ 1/2 
(8.3) , = (0, x vi —- zis) —Vnd, 
and 


m ti 2 
l, —(b, x yi = > ri.) — V1: 


9 


If we now differentiate $\beta$ with respect to $\sqrt{\lambda_1}$, we get, through the $x_{11}$ integral,
an integrand which is


(8.4)  $\exp\left(-\tfrac{1}{2} l_1^2\right) - \exp\left(-\tfrac{1}{2} l_2^2\right)$.


For all positive values of $\sqrt{\lambda_1}$, the expression in (8.4) will be negative; and for
all negative values of $\sqrt{\lambda_1}$, it will be positive. Thus


$\partial\beta/\partial\sqrt{\lambda_1} < 0$ if $\sqrt{\lambda_1} > 0$, and $\partial\beta/\partial\sqrt{\lambda_1} > 0$ if $\sqrt{\lambda_1} < 0$.


Similarly for any other $\lambda_i$. Therefore the second kind of error of the simultaneous
analysis of variance test is a decreasing function of each $|\sqrt{\lambda_i}|$ separately,
and consequently the power of the test $(= 1 - \beta)$ is an increasing function of each
$|\sqrt{\lambda_i}|$ separately. Hence the theorem.
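
The sign argument can be illustrated numerically in one dimension. The sketch below is only an analogue of the proof, under stated assumptions: a single standard normal coordinate is integrated between $l_1 = -a - d$ and $l_2 = a - d$, where `d` plays the role of $\sqrt{\lambda_1}$ and `a` is a hypothetical half-width standing in for the quantity fixed by the remaining variables. The function name and the numerical values are illustrative and do not come from the paper.

```python
from statistics import NormalDist


def acceptance_probability(a: float, d: float) -> float:
    """P(l1 <= X <= l2) for X ~ N(0, 1), with l1 = -a - d and l2 = a - d."""
    phi = NormalDist()  # standard normal distribution
    return phi.cdf(a - d) - phi.cdf(-a - d)


if __name__ == "__main__":
    a = 1.5  # assumed half-width of the integration range
    for d in (0.0, 0.5, 1.0, 2.0):
        # The probability shrinks as |d| grows, so 1 - probability grows.
        print(f"d = {d:3.1f}   probability = {acceptance_probability(a, d):.4f}")
```

The printed probabilities decrease as `d` moves away from zero in either direction, which is the one-dimensional counterpart of the second kind of error decreasing in $|\sqrt{\lambda_1}|$.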


9. Concluding remarks. The distribution of the studentized largest chi-square
given in (7.3) has been found useful in obtaining simultaneous confidence
bounds on certain parameters connected with the main effects of factorial
experiments having $m$ factors at $s$ levels each. Extensive tables of the
distribution of the studentized largest chi-square are being prepared and will be
offered elsewhere. Using techniques similar to those given in Section 8, it is
possible to prove that the power function of the test for the hypothesis that the $m$
main effects of the $s^m$ factorial experiments are simultaneously zero has the
monotonicity property.


10. Acknowledgment. The author wishes to express his indebtedness to Prof.
APPENDIX


Convergence of the series on the right-hand side of (6.2). Consider


(A.1)  $I(n; x) = \int_0^x u^n e^{-u}\,du = \dfrac{x^{n+1}}{n+1}\,\exp\left[-\dfrac{(n+1)x}{n+2}\right] \sum_{i=0}^{\infty} A_i^{(1)} x^i,$

where 
w ; | (-1)' 1 1 a 
(A.2 Av? }1 a . ———en Al 
| +44] v! ata tnd? 


Since we will be interested in cases where $n$ is of the form $r/2$
$(r = -1, 0, 1, \dots)$, we shall prove the convergence of the series on the
right-hand side of (A.1) for the case $n = r/2$ $(r = -1, 0, 1, \dots)$. The case
when $r = -1$ has been already considered [7].

CASE 1. $n = 0$, i.e., $r = 0$.

In this case,


(A.3)  $A_{2i+1}^{(1)} = 0 \quad (i = 0, 1, 2, \dots), \qquad A_{2i}^{(1)} = \dfrac{1}{2^{2i}\,(2i+1)!} \quad (i = 1, 2, \dots),$

and

$A_0^{(1)} = 1.$

Hence

(A.4)  $\dfrac{A_{2i}^{(1)}}{A_{2i-2}^{(1)}} = \dfrac{1}{4 \cdot 2i(2i+1)} < \dfrac{1}{16 i^2}.$


Consequently $\sum_{i=0}^{\infty} A_{2i}^{(1)}$ is convergent, and the value of the ratio of the $i$th to
the $(i - 1)$th nonvanishing term of the power series in (A.1) is less than $x^2/16i^2$. Therefore in
this situation the series $\sum_{i=0}^{\infty} A_i^{(1)} x^i$ is absolutely convergent, and so the powers
of the series are also convergent. It may be noticed that the series (6.2) is rather
rapidly convergent, so that for a relatively small $x$ only a few terms of the series
will suffice for any degree of accuracy desired in practice.
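
The case $n = 0$ can be checked directly, since the integral of $e^{-u}$ over $(0, x)$ is $1 - e^{-x}$. The following sketch compares that closed form with $x\,e^{-x/2}\sum_i A_{2i}x^{2i}$, using the coefficients $A_{2i} = 1/(2^{2i}(2i+1)!)$; the function names and the number of terms are illustrative choices, not part of the paper.

```python
import math


def series_side(x: float, terms: int = 12) -> float:
    """x * exp(-x/2) * sum_i A_{2i} x^(2i), with A_{2i} = 1 / (4^i (2i+1)!)."""
    total = sum(x ** (2 * i) / (4 ** i * math.factorial(2 * i + 1))
                for i in range(terms))
    return x * math.exp(-x / 2.0) * total


def integral_side(x: float) -> float:
    """Closed form of the integral of exp(-u) over (0, x)."""
    return 1.0 - math.exp(-x)


if __name__ == "__main__":
    for x in (0.1, 1.0, 5.0):
        print(x, integral_side(x), series_side(x))
```

With a dozen terms the two sides agree to rounding error for moderate $x$, in line with the remark on rapid convergence.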

CASE 2. $n > 0$, i.e., $r > 0$.

Now from (A.2), after a little simplification, we get 


(n+1)° (n+1)! 








sss. Sty St 

(A5) 7 (n + 2)'(n +241)! 
A5 [23 1 (n+ 2)(n + 3) te (-1)' (vm +a! _ | 
n+1 2! in + 1)? 3! - il (n Flin +)’ 
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Hence 
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A}' n+ 1 


AM (n+2)n +741) 


n+1 
1 


=(nFi) 
sum of first 7 terms in (1 + a) 


—(n+1) 
sum of first (¢ + 1) terms in (a + L ) 


Therefore if $i$ is large,


(A.7)  $\dfrac{A_i^{(1)}}{A_{i-1}^{(1)}} < \dfrac{1}{i}.$


Consequently $\sum_{i=0}^{\infty} A_i^{(1)}$ is convergent, and the value of the ratio of the $i$th to the
$(i - 1)$th term of the power series in (A.1) is less than $x/i$. Hence in this
situation the series $\sum_{i=0}^{\infty} A_i^{(1)} x^i$ is absolutely convergent, and therefore the powers of
the series are also convergent.
NOTES


ON THE DISTRIBUTION OF THE LIKELIHOOD RATIO 


By Robert V. Hogg
State University of Iowa 


1. Summary. In an investigation of the distribution of the likelihood ratio
$\lambda$, Wilks [3] proved, under certain regularity conditions, that $-2 \ln \lambda$ is, except
for terms of order $1/\sqrt{n}$, distributed like $\chi^2$ with $k - m$ degrees of freedom, where
$k$ is the dimension of the parameter space $\Omega$ of admissible hypotheses and $m$ is
the dimension of the parameter space $\omega$ of null hypotheses. In this paper, we
consider the nonregular densities investigated by R. C. Davis [1] and show that
for certain hypotheses $-2 \ln \lambda$ has an exact $\chi^2$-distribution with $2(k - m)$
degrees of freedom.


2. A lemma. We find it convenient to prove the following lemma first. 


LEMMA. Let the $k$ independent random variables, $w_1, w_2, \dots, w_k$, have the joint
density function

$\prod_{i=1}^{k} \left(n_i w_i^{n_i - 1}\right), \qquad 0 < w_i < 1.$

Let $u = \prod_{i=1}^{k} w_i^{n_i}$ and $v = u/s^n$, where $s = \max(w_1, w_2, \dots, w_k)$ and
$n = \sum_{i=1}^{k} n_i$. Then $-2 \ln u$ and $-2 \ln v$ have $\chi^2$-distributions with $2k$ and $2(k - 1)$
degrees of freedom, respectively.
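
The statement of the lemma can be checked by simulation. The sketch below is a rough check rather than a proof: it draws each $w_i$ as $U^{1/n_i}$ with $U$ uniform on $(0, 1)$, which has density $n_i w^{n_i-1}$, and compares the sample means of $-2\ln u$ and $-2\ln v$ with $2k$ and $2(k-1)$, the means of the stated $\chi^2$-distributions. The sample sizes, the number of replications and the seed are arbitrary assumptions.

```python
import math
import random


def simulate(sizes, draws=20000, seed=1):
    """Return the sample means of -2 ln u and -2 ln v over `draws` replications."""
    rng = random.Random(seed)
    n = sum(sizes)
    sum_u = sum_v = 0.0
    for _ in range(draws):
        # w_i = U^(1/n_i) has density n_i * w^(n_i - 1) on (0, 1);
        # 1 - random() lies in (0, 1], so the logarithm is always defined.
        w = [(1.0 - rng.random()) ** (1.0 / ni) for ni in sizes]
        log_u = sum(ni * math.log(wi) for ni, wi in zip(sizes, w))
        log_v = log_u - n * math.log(max(w))  # v = u / s^n with s = max w_i
        sum_u += -2.0 * log_u
        sum_v += -2.0 * log_v
    return sum_u / draws, sum_v / draws


if __name__ == "__main__":
    mean_u, mean_v = simulate([2, 3, 5])  # k = 3
    print(f"mean of -2 ln u: {mean_u:.3f}  (2k = 6)")
    print(f"mean of -2 ln v: {mean_v:.3f}  (2(k-1) = 4)")
```

Agreement of the means is of course only a necessary condition; the proof that follows establishes the full distributions.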

PROOF. Obviously $w_i^{n_i}$ has the density 1, $0 < w_i^{n_i} < 1$; thus, $-2 \ln w_i^{n_i}$ has a
$\chi^2$-distribution with 2 degrees of freedom. Since $-2 \ln u$ is the sum of $k$
independent $\chi^2$ variables, each with 2 degrees of freedom, $-2 \ln u$ has a
$\chi^2$-distribution with $2k$ degrees of freedom. This completes the proof of the first
part of the lemma.

We note that $s^n$ has the density 1, $0 < s^n < 1$. Thus, $-2 \ln s^n$ has a
$\chi^2$-distribution with 2 degrees of freedom. We can show as follows that $v$ and $s$
are stochastically independent. Let us introduce the parameter $b$ in the joint
density:

$\left(\prod_{i=1}^{k} n_i w_i^{n_i - 1}\right) \Big/ b^{n}, \qquad 0 < w_i < b.$

The variable $s$ is the sufficient statistic for $b$, and its density $n s^{n-1}/b^n$, $0 < s < b$,
is complete. The distribution of the ratio $v$ is obviously free of the parameter
$b$. These facts imply, by use of an extension of a theorem of Neyman [2], that
$v$ and $s$ are stochastically independent. Since we can write
$-2 \ln u = -2 \ln v - 2 \ln s^n$, where the two terms on the right are independent,
we find that $-2 \ln v$ has a $\chi^2$-distribution with $2(k - 1)$ degrees
of freedom.

3. One extremity of the range depending on $\theta$. Let $x$ possess the probability
density function

$f(x; \theta) = \begin{cases} Q(\theta)\,P(x), & a \le x \le b(\theta), \\ 0, & \text{elsewhere}, \end{cases}$

where $P(x)$ is a real single-valued positive continuous function of $x$ defined
almost everywhere and $b(\theta)$ is a strictly monotone continuous function of $\theta$
for some interval of values of $\theta$. Of course,

(1)  $[Q(\theta)]^{-1} = \int_a^{b(\theta)} P(x)\,dx;$

thus, $Q(\theta)$ is a strictly monotone continuous function of $\theta$. Consider the $k$,
$k = 1, 2, 3, \dots$, mutually independent populations having the densities $f(x; \theta_i)$,
$i = 1, 2, 3, \dots, k$. We test, by use of the likelihood ratio $\lambda$, the hypothesis
$\theta_1 = \theta_2 = \cdots = \theta_k = \theta_0$, where $\theta_0$ is some specified value, against all possible
alternatives. Let $n_1, n_2, \dots, n_k$ be the sample sizes and let $z_1, z_2, \dots, z_k$
be respectively the largest items in the several samples. Thus, $t_i = b^{-1}(z_i)$,
$i = 1, 2, \dots, k$, is the maximum likelihood estimate of $\theta_i$ and hence

$\lambda = \dfrac{\prod_{i=1}^{k} [Q(\theta_0)]^{n_i}}{\prod_{i=1}^{k} [Q(t_i)]^{n_i}}.$

By using (1),

$\lambda = \prod_{i=1}^{k} \left( \int_a^{z_i} Q(\theta_0)\,P(x)\,dx \right)^{n_i}.$

If the null hypothesis is true,

$w_i = \int_a^{z_i} Q(\theta_0)\,P(x)\,dx$

is distributed like the largest item of a sample from a uniform density with
domain zero to one; that is, $w_i$ has the density $n_i w_i^{n_i - 1}$, $0 < w_i < 1$. So, by
use of the lemma, $-2 \ln \lambda$ has a $\chi^2$-distribution with $2k$ degrees of freedom.
We now take $k$ greater than one and test, by use of the likelihood ratio $\lambda$,
the hypothesis $\theta_1 = \theta_2 = \cdots = \theta_k$ against all possible alternatives. Here,

$\lambda = \dfrac{[Q(t)]^{n}}{\prod_{i=1}^{k} [Q(t_i)]^{n_i}},$

where $t_i = b^{-1}(z_i)$, $z = \max(z_1, z_2, \dots, z_k)$, $t = b^{-1}(z)$, and $n = \sum_{i=1}^{k} n_i$. Hence,

$\lambda = \dfrac{\prod_{i=1}^{k} \left( \int_a^{z_i} Q(\theta)\,P(x)\,dx \right)^{n_i}}{\left( \int_a^{z} Q(\theta)\,P(x)\,dx \right)^{n}}.$

If the null hypothesis is true and if $\theta$ represents that common, but unknown,
value of the parameter, we argue, by using the lemma, that $-2 \ln \lambda$ has a
$\chi^2$-distribution with $2(k - 1)$ degrees of freedom.
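
A worked special case may help fix ideas. It is an illustration chosen here, not an example from the paper: take rectangular densities on $(0, \theta)$, so that $a = 0$, $P(x) = 1$, $b(\theta) = \theta$ and, by (1), $Q(\theta) = 1/\theta$.

```latex
% Illustrative special case (an assumption, not from the source):
% a = 0, P(x) = 1, b(\theta) = \theta, hence Q(\theta) = 1/\theta.
\begin{align*}
  t_i &= z_i, \qquad t = z = \max(z_1, \dots, z_k), \\
  \lambda &= \frac{[Q(t)]^{n}}{\prod_{i=1}^{k} [Q(t_i)]^{n_i}}
           = \prod_{i=1}^{k} \left( \frac{z_i}{z} \right)^{n_i},
           \qquad n = \sum_{i=1}^{k} n_i, \\
  -2 \ln \lambda &= 2 \sum_{i=1}^{k} n_i \ln \frac{z}{z_i}.
\end{align*}
% Under the null hypothesis the last quantity is chi-square
% with 2(k - 1) degrees of freedom, by the lemma.
```

Here $w_i = z_i/\theta$ and $s = z/\theta$, so $\lambda$ is exactly the ratio $v = u/s^n$ of the lemma and does not involve the unknown common $\theta$.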


4. Both extremities of the range depending on $\theta$. Let $x$ possess the probability
density function

$f(x; \theta) = \begin{cases} Q(\theta)\,P(x), & \theta \le x \le b(\theta), \\ 0, & \text{elsewhere}, \end{cases}$

where $P(x)$ is a real single-valued positive continuous function of $x$ defined almost
everywhere and $b(\theta)$ is a strictly monotone decreasing continuous function of $\theta$
for some interval of values of $\theta$. Again,

(2)  $[Q(\theta)]^{-1} = \int_{\theta}^{b(\theta)} P(x)\,dx;$

so $Q(\theta)$ is a strictly monotone increasing continuous function of $\theta$. Consider the
$k$, $k = 1, 2, 3, \dots$, mutually independent populations having the densities
$f(x; \theta_i)$, $i = 1, 2, \dots, k$. We test, by use of the likelihood ratio $\lambda$, the
hypothesis $\theta_1 = \theta_2 = \cdots = \theta_k = \theta_0$, where $\theta_0$ is some specified value, against all possible
alternatives. Let $n_1, n_2, \dots, n_k$ be the sample sizes. Let $y_1, y_2, \dots, y_k$ and
$z_1, z_2, \dots, z_k$ be respectively the smallest and largest items in the samples.
Therefore, $t_i = \min\{y_i, b^{-1}(z_i)\}$, $i = 1, 2, \dots, k$, is the maximum likelihood
estimate of $\theta_i$ and hence


$\lambda = \dfrac{\prod_{i=1}^{k} [Q(\theta_0)]^{n_i}}{\prod_{i=1}^{k} [Q(t_i)]^{n_i}},$

or

$\lambda = \prod_{i=1}^{k} \left( \int_{t_i}^{b(t_i)} Q(\theta_0)\,P(x)\,dx \right)^{n_i}.$

If the null hypothesis is true, we observe that

$P[t_i \ge r] = P[y_i \ge r,\ z_i \le b(r)] = \left( \int_r^{b(r)} Q(\theta_0)\,P(x)\,dx \right)^{n_i}.$

Hence,

$w_i^{n_i} = \left( \int_{t_i}^{b(t_i)} Q(\theta_0)\,P(x)\,dx \right)^{n_i}$

has a uniform density over $(0, 1)$, or $w_i$ has the density $n_i w_i^{n_i - 1}$, $0 < w_i < 1$.
Thus, according to the lemma, $-2 \ln \lambda$ has a $\chi^2$-distribution with $2k$ degrees of
freedom. Similarly, if we require $k$ to be greater than one, we can show that if
$\lambda$ is the likelihood ratio for the hypothesis $\theta_1 = \theta_2 = \cdots = \theta_k$, then $-2 \ln \lambda$
has a $\chi^2$-distribution with $2(k - 1)$ degrees of freedom when the null hypothesis
is true.

In the cases presented above, the dimension, $m$, of the parameter space $\omega$
of the null hypothesis is either 0 or 1. This can be extended somewhat. If the
null hypothesis is that the $\theta$'s fall into $m$ equal sets, $-2 \ln \lambda$ is distributed as
$\chi^2$ with $2(k - m)$ degrees of freedom provided the null hypothesis is true. For
example, suppose $k = 6$ and that we test the hypothesis $\theta_1 = \theta_2 = \theta_3 = \theta_4$ and
$\theta_5 = \theta_6$ against all possible alternatives. Then $-2 \ln \lambda$ has a $\chi^2$-distribution with
$2(6 - 2) = 8$ degrees of freedom.
AN APPLICATION OF CHUNG'S LEMMA TO THE KIEFER-WOLFOWITZ
STOCHASTIC APPROXIMATION PROCEDURE¹


By Cyrus Derman²
Syracuse University


1. Summary. Let $M(x)$ be a strictly increasing regression function for $x < \theta$,
and strictly decreasing regression function for $x > \theta$. Under conditions 1, 2,
and 3 given below, the stochastic approximation procedure proposed by Kiefer
and Wolfowitz [3] is shown to converge stochastically to $\theta$. Under the additional
conditions 4, 5, 6 given below, the procedure is shown to converge in distribution
to the normal distribution. Our method is the one used by Chung [2].


2. Introduction. Let $H(y \mid x)$ be a family of distribution functions which
depend on the parameter $x$ and let $M(x) = \int_{-\infty}^{\infty} y\,dH(y \mid x)$. Suppose $M(x)$ is
strictly increasing for $x < \theta$, and strictly decreasing for $x > \theta$. Let $\{a_n\}$ and
$\{c_n\}$ be sequences of positive numbers such that

$c_n \to 0, \qquad \sum a_n = \infty, \qquad \sum a_n c_n < \infty, \qquad \sum a_n^2 c_n^{-2} < \infty.$

Kiefer and Wolfowitz [3] suggested a recursive scheme for estimating $\theta$ which
is as follows. Let $z_1$ be an arbitrary real number. For all positive integral $n$,

(1)  $Z_{n+1} = Z_n + \dfrac{a_n}{c_n}\,(y_{2n} - y_{2n-1}),$

where $y_{2n}$ and $y_{2n-1}$ are independent chance variables with respective
distributions $H(y \mid z_n + c_n)$ and $H(y \mid z_n - c_n)$. Under certain regularity conditions
they proved that $Z_n$ converges stochastically to $\theta$ as $n \to \infty$. Blum [1], later,
under slightly weaker conditions, proved convergence with probability one.
However, the regularity conditions imposed are such that they are not
satisfied if $M(x) = K - K'(x - \theta)^2$, where $K$ and $K'$ are any constants ($K' > 0$).
Since this case is a common one, it seems important that it be considered. Below,
we shall prove some convergence theorems under conditions which are quite
restrictive. However, the above function, and functions of a slightly more general
type, will satisfy these conditions. Since our purpose is to show that the
Kiefer-Wolfowitz procedure is applicable in the parabolic case, no attempt will be
made here to weaken the conditions.
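
The recursion (1) is easy to try on the parabolic case. The sketch below is a simulation under stated assumptions, not part of the paper: the regression function is taken as $M(x) = 1 - 0.5(x - \theta)^2$, observations carry independent normal errors, and the sequences are $a_n = 1/n$ and $c_n = n^{-1/3}$, which satisfy the four displayed conditions on $\{a_n\}$ and $\{c_n\}$. The function name, the noise level and the seed are hypothetical.

```python
import random


def kiefer_wolfowitz(theta=2.0, z1=0.0, steps=20000, sigma=0.5, seed=7):
    """Run recursion (1) on M(x) = 1 - 0.5 (x - theta)^2 with N(0, sigma^2) errors."""
    rng = random.Random(seed)

    def observe(x):
        # One noisy observation with mean M(x); the quadratic M is an assumed test case.
        return 1.0 - 0.5 * (x - theta) ** 2 + rng.gauss(0.0, sigma)

    z = z1
    for n in range(1, steps + 1):
        a_n = 1.0 / n               # sum of a_n diverges
        c_n = n ** (-1.0 / 3.0)     # c_n -> 0; sums of a_n c_n and (a_n / c_n)^2 converge
        y_plus = observe(z + c_n)   # plays the role of y_{2n}
        y_minus = observe(z - c_n)  # plays the role of y_{2n-1}
        z += (a_n / c_n) * (y_plus - y_minus)
    return z


if __name__ == "__main__":
    print("final estimate of theta:", kiefer_wolfowitz())
```

The final iterate settles near the assumed maximum location `theta`; the rate at which it does so is the subject of the theorems below.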

The main tool to be used is the following lemma proved by Chung [2] which he 
used in his analysis of the Robbins-Monro [4] procedure. 

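LEMMA. Suppose $\{b_n\}$, $n \ge 1$, is a sequence of real numbers such that for $n \ge n_0$,

$b_{n+1} \ge \left(1 - \dfrac{c}{n^{s}}\right) b_n + \dfrac{c_1}{n^{t}},$ where $0 < s < 1$, $s < t$, $c > 0$, $c_1 > 0$.

Then $\liminf_{n \to \infty} n^{t-s} b_n \ge c_1/c$. If

(2)  $b_{n+1} \le \left(1 - \dfrac{c_n}{n^{s}}\right) b_n + \dfrac{c_2}{n^{t}},$

where $0 < s < 1$, $s < t$, $c_n \ge c > 0$, $c_2 > 0$, then $\limsup_{n \to \infty} n^{t-s} b_n \le c_2/c$.


We remark that if in (2) $c_2$ is replaced by a sequence $\{c_{2n}\}$ of positive numbers
such that $c_{2n} \to 0$, then $\limsup_{n \to \infty} n^{t-s} b_n \le 0$.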
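
When the recursion holds with equality and a constant $c$, both parts of the lemma apply and $n^{t-s} b_n$ should approach $c_1/c$. The sketch below iterates that equality case numerically; the parameter values, the starting value and the function name are arbitrary assumptions made for illustration.

```python
def scaled_limit(c=1.0, c1=2.0, s=0.5, t=1.5, steps=400000, b1=1.0):
    """Iterate b_{n+1} = (1 - c/n^s) b_n + c1/n^t and return n^(t-s) * b_n at n = steps."""
    b = b1
    for n in range(1, steps):
        b = (1.0 - c / n ** s) * b + c1 / n ** t
    # After the loop, b holds b_n for n = steps.
    return steps ** (t - s) * b


if __name__ == "__main__":
    print("n^(t-s) * b_n =", scaled_limit(), " expected limit c1/c =", 2.0)
```

The starting value `b1` is forgotten quickly, so the scaled sequence depends only on the ratio $c_1/c$ in the limit.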


3. A Convergence Theorem. We postulate the following conditions. 


CONDITION 1. There exist positive constants $K_1$, $K_2$, and $C_0$ such that for every
$c$, where $0 < c \le C_0$,

$-cK_2 (x - \theta)^2 \le (M(x + c) - M(x - c))(x - \theta) \le -cK_1 (x - \theta)^2.$

The above is a condition that the function $M(x)$ does not increase (decrease)
too rapidly or too slowly. We remark that for $M(x) = K - K'(x - \theta)^2$,
$K_1 = K_2 = K'$.

CONDITION 2. Let $\sigma^2(x) = \int_{-\infty}^{\infty} (y - M(x))^2\,dH(y \mid x)$. There exist real
numbers $M_1$ and $M_2$ such that

$0 < M_1 < \sigma^2(x) < M_2 < \infty.$

CONDITION 3. Let $\eta$ and $\epsilon$ be any two real numbers such that $\eta > \epsilon > 0$ and
$\eta + \epsilon < \tfrac{1}{2}$. We set $a_n = 1/n^{1-\epsilon}$ and $c_n = 1/n^{1/2-\eta}$.
THEOREM 1. If Conditions 1, 2 and 3 hold, then

(3)  $\dfrac{M_1}{K_2} \le \liminf_{n \to \infty} n^{2\eta-\epsilon} b_n \le \limsup_{n \to \infty} n^{2\eta-\epsilon} b_n \le \dfrac{M_2}{K_1},$

where $b_n = E(Z_n - \theta)^2$.

PROOF. From (1) we have

(4)  $b_{n+1} = b_n + \dfrac{2a_n}{c_n}\,E(y_{2n} - y_{2n-1})(Z_n - \theta) + \dfrac{a_n^2}{c_n^2}\,E(y_{2n} - y_{2n-1})^2.$

It follows easily from Condition 1 that

(5)  $-c_n K_2 b_n \le E(y_{2n} - y_{2n-1})(Z_n - \theta) \le -c_n K_1 b_n,$

also from Conditions 1 and 2 we have

(6)  $2M_1 + c_n^2 K_1^2 b_n \le E(y_{2n} - y_{2n-1})^2 \le 2M_2 + c_n^2 K_2^2 b_n.$

Therefore from (4), (5), (6) and Condition 3 we get

(7)  $b_n \left(1 - \dfrac{2K_2}{n^{1-\epsilon}} + \dfrac{K_1^2}{n^{2(1-\epsilon)}}\right) + \dfrac{2M_1}{n^{1+2(\eta-\epsilon)}} \le b_{n+1} \le b_n \left(1 - \dfrac{2K_1}{n^{1-\epsilon}} + \dfrac{K_2^2}{n^{2(1-\epsilon)}}\right) + \dfrac{2M_2}{n^{1+2(\eta-\epsilon)}}.$

For any $\delta > 0$, there exists an $n_0$ such that for $n > n_0$, $2K_1 - K_2^2/n^{1-\epsilon} \ge
2(K_1 - \delta)$. Therefore, by the second part of Chung's lemma, $\limsup_{n \to \infty} n^{2\eta-\epsilon} b_n \le
M_2/(K_1 - \delta)$. But since $\delta$ is arbitrary, the right side of (3) follows. The left side of
(3) follows immediately from (7) and the first part of Chung's Lemma.

It is a corollary of Theorem 1 that $Z_n$ converges stochastically to $\theta$ as $n \to \infty$.
We remark that for stochastic convergence we need not impose the condition
that $M_1 > 0$.
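
The powers of $n$ in the proof come from substituting the sequences of Condition 3 into (4), (5) and (6). The short derivation below spells out that arithmetic; in it $K$ stands for either $K_1$ or $K_2$ and $M$ for either $M_1$ or $M_2$, a shorthand introduced here only for brevity.

```latex
% Substitution of a_n = n^{-(1-\epsilon)} and c_n = n^{-(1/2-\eta)} (Condition 3).
% K denotes K_1 or K_2 and M denotes M_1 or M_2 (shorthand, not in the source).
\begin{align*}
  \frac{2a_n}{c_n}\, c_n K\, b_n
      &= \frac{2K\, b_n}{n^{1-\epsilon}}, \\
  \frac{a_n^2}{c_n^2}\, c_n^2 K^2\, b_n
      &= \frac{K^2\, b_n}{n^{2(1-\epsilon)}}, \\
  \frac{a_n^2}{c_n^2}\, 2M
      &= \frac{2M}{n^{2(1-\epsilon) - 2(1/2-\eta)}}
       = \frac{2M}{n^{1+2(\eta-\epsilon)}}.
\end{align*}
% Chung's lemma is then applied with s = 1 - \epsilon and t = 1 + 2(\eta - \epsilon),
% so that the normalizing power is t - s = 2\eta - \epsilon.
```

The requirement $0 < s < 1$, $s < t$ of the lemma is met because $\eta > \epsilon > 0$.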


4. Convergence to the Normal Distribution. Let

$\beta^{(r)}(x) = \int_{-\infty}^{\infty} |y - M(x)|^r\,dH(y \mid x)$ for $r > 0$.

We shall need the following condition on the $\beta$'s.

CONDITION 4. There exist real numbers $M_1(r)$ and $M_2(r)$ such that for all $x$,

$0 < M_1(r) \le \beta^{(r)}(x) \le M_2(r) < \infty, \qquad r = 1, 2, \dots.$

We shall continue to denote $M_1(2)$ and $M_2(2)$ by $M_1$ and $M_2$ respectively.

LEMMA 1. If Conditions 1, 3 and 4 hold, then

(8)  $(2k - 1)(2k - 3) \cdots 3 \cdot 1 \left(\dfrac{M_1}{K_2}\right)^{k} \le \liminf_{n \to \infty} n^{k(2\eta-\epsilon)} b_n^{(2k)} \le \limsup_{n \to \infty} n^{k(2\eta-\epsilon)} b_n^{(2k)} \le (2k - 1)(2k - 3) \cdots 3 \cdot 1 \left(\dfrac{M_2}{K_1}\right)^{k}, \qquad k = 1, 2, \dots,$

where $b_n^{(r)} = E(Z_n - \theta)^r$.
PROOF. The proof is by induction. By Theorem 1, (8) is true when $k = 1$.
From (1) and Condition 3 we have for any positive integer $r$,

(9)  $b_{n+1}^{(r)} = b_n^{(r)} + \dfrac{r}{n^{1/2+\eta-\epsilon}}\,E(Z_n - \theta)^{r-1}(y_{2n} - y_{2n-1}) + \dfrac{r(r-1)}{2}\,\dfrac{1}{n^{1+2(\eta-\epsilon)}}\,E(Z_n - \theta)^{r-2}(y_{2n} - y_{2n-1})^2 + \sum_{j=3}^{r} \binom{r}{j} \dfrac{1}{n^{j(1/2+\eta-\epsilon)}}\,E(Z_n - \theta)^{r-j}(y_{2n} - y_{2n-1})^{j}.$

Imposing Conditions 1, 3 and 4 and utilizing the induction assumption, (9)
yields for even $r$,


$b_{n+1}^{(r)} \le b_n^{(r)} \left(1 - \dfrac{rK_1}{n^{1-\epsilon}} + \dfrac{r(r-1)}{2}\,\dfrac{K_2^2}{n^{2(1-\epsilon)}}\right) + \dfrac{r(r-1)(r-3) \cdots 3 \cdot 1}{n^{r(\eta-\epsilon/2)+(1-\epsilon)}}\,M_2 \left(\dfrac{M_2}{K_1}\right)^{(r-2)/2} + o\left(n^{-[r(\eta-\epsilon/2)+(1-\epsilon)]}\right).$

The inequality on the right side of (8) follows, as before, from an application of
Chung's lemma. The inequality on the left can be obtained in a similar manner,
replacing $M_2$ by $M_1$ and $K_1$ by $K_2$, reversing the inequalities, and again applying
Chung's lemma. This proves the lemma.

CONDITION 5. $\sigma^2(x)$ is continuous at $x = \theta$.

CONDITION 6. There exist positive numbers $\delta$, $L$, and $C_0$ such that for all $c$,
$0 < c \le C_0$,

$M(x + c) - M(x - c) = cK(x)(x - \theta)$

where

$-K' - W|x - \theta|^{\delta} \le K(x) \le -K' + W|x - \theta|^{\delta}$ for $|x - \theta| \le L$,
$-K_2 \le K(x) \le -K_1$ for $|x - \theta| > L$,

and where $W$ and $K'$ are positive numbers such that $-K' + WL^{\delta} \le -K_1$ and
$-K' - WL^{\delta} \ge -K_2$.

Condition 6 is a strengthening of Condition 1 to the extent that locally at
$x = \theta$, $M(x)$ is parabolic.

LEMMA 2. If Conditions 2, 3, 4, 5, 6 hold, then

(10)  $\lim_{n \to \infty} n^{2\eta-\epsilon} b_n = \dfrac{\sigma^2(\theta)}{K'}.$

PROOF. From Condition 6 we have

$c_n\left(-K' b_n - W E|Z_n - \theta|^{2+\delta}\right) \le E(y_{2n} - y_{2n-1})(Z_n - \theta) \le c_n\left(-K' b_n + W E|Z_n - \theta|^{2+\delta}\right).$

From Lemma 1 and Lyapunov's inequality, for $n$ large enough and for some $k$,

(11)  $E|Z_n - \theta|^{2+\delta} \le \left(b_n^{(2k)}\right)^{(2+\delta)/2k} \le \dfrac{R_2^{(2+\delta)/2k}\, b_n}{R_1\, n^{(\eta-\epsilon/2)\delta}},$

where $R_2$ denotes the upper bound in (8) and $R_1$ denotes the lower bound in (8)
for $k = 1$. Also,

$E(y_{2n} - y_{2n-1})^2 = E\left(\sigma^2(Z_n + c_n) + \sigma^2(Z_n - c_n)\right) + \gamma c_n^2 K_2^2 b_n,$

where $|\gamma| < 1$. But since $Z_n$ converges stochastically to $\theta$, $c_n \to 0$, and $\sigma^2(x)$ is
bounded and continuous at $\theta$, we have using (11)

(12)  $b_{n+1} = b_n \left(1 - \dfrac{2}{n^{1-\epsilon}}(K' - w_n) + \dfrac{\gamma K_2^2}{n^{2(1-\epsilon)}}\right) + \dfrac{2}{n^{1+2(\eta-\epsilon)}}\left(\sigma^2(\theta) + d_n\right),$

where both $w_n$ and $d_n$ tend to zero as $n \to \infty$. Both parts of Chung's lemma can
be applied to (12). Therefore, (10) follows.

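LEMMA 3. If Conditions 2, 3, 4, 5, 6 hold, then $\lim_{n \to \infty} n^{\eta-\epsilon/2} b_n^{(1)} = 0$.

PROOF. It follows from (1) and Condition 6 that

(13)  $b_n^{(1)} \left(1 - \dfrac{K'}{n^{1-\epsilon}}\right) - \dfrac{W}{n^{1-\epsilon}}\,E|Z_n - \theta|^{1+\delta} \le b_{n+1}^{(1)} \le b_n^{(1)} \left(1 - \dfrac{K'}{n^{1-\epsilon}}\right) + \dfrac{W}{n^{1-\epsilon}}\,E|Z_n - \theta|^{1+\delta}.$

But by Lemma 1 and Lyapunov's inequality, there is a constant $R > 0$ dependent
on $\delta$ such that

(14)  $E|Z_n - \theta|^{1+\delta} \le \dfrac{R}{n^{(1+\delta)(\eta-\epsilon/2)}}.$

From (13) and (14) and the remark following Chung's lemma it follows that

(15)  $\limsup_{n \to \infty} n^{\eta-\epsilon/2} b_n^{(1)} \le 0.$

Also, it can be shown in the same way that

(16)  $\limsup_{n \to \infty} n^{\eta-\epsilon/2} \left(-b_n^{(1)}\right) \le 0.$

Lemma 3 follows from (15) and (16).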

THEOREM 2. If Conditions 2, 3, 4, 5, 6 hold, then $n^{\eta-\epsilon/2}(Z_n - \theta)$ converges in
distribution to a normal distribution with mean zero and variance $\sigma^2(\theta)/K'$.

PROOF. By using induction, it follows from Lemmas 1, 2 and 3, and Lyapunov's
inequality that

(17)  $\lim_{n \to \infty} n^{r(\eta-\epsilon/2)} b_n^{(r)} = \left(\dfrac{\sigma^2(\theta)}{K'}\right)^{r/2} (r - 1)(r - 3) \cdots 3 \cdot 1$ for $r$ even, and $= 0$ for $r$ odd.

The method of induction is similar to that used in proving Lemma 1. We shall
omit the details. However, (17) indicates that the moments of $n^{\eta-\epsilon/2}(Z_n - \theta)$
converge to the moments of the indicated normal distribution. The result
follows from the well-known theorem on the convergence of moments.

SOME RESULTS ON RESTRICTED OCCUPANCY THEORY¹
By John E. Freund and Arthur N. Pozner
Virginia Polytechnic Institute
1. Introduction. If $k$ judges rate a product on a discrete scale, say $0, 1, 2, \dots,$
or $m$, it is not only important to know the average rating assigned to the product,
but it is also important to know the consistency of the ratings. Whereas the
average rating is $r/k$, where $r$ is the total number of points assigned to the
product by the $k$ judges, it seems reasonable to use the variance of the ratings,
i.e.,

(1)  $s^2 = \dfrac{1}{k} \sum_{i=0}^{m} v_i \left(i - \dfrac{r}{k}\right)^2,$

as a measure of consistency. Here $v_i$ stands for the number of judges who gave
the product the rating $i$.

In order to test the consistency of such ratings, it will be necessary to find a
suitable mathematical model which will assign low probabilities to cases in
which the ratings are either very inconsistent or overly consistent. It is felt that
the appropriate model is provided by that of restricted occupancy theory, in
which we consider as equally likely all possible distributions of $r$ indistinguishable
objects among $k$ cells with at most $m$ objects per cell. With this model we
shall then test whether it is reasonable to suppose that the $r$ points given by the
$k$ judges are randomly distributed among the $k$ judges.

Related problems dealing with the probability that $x$ cells contain more than
$q$ objects when $r$ objects are distributed among $k$ cells, and the probability that
$x$ cells contain $m$ objects when $r$ objects are distributed among $k$ cells with at
most $m$ objects per cell, were investigated by Baticle, [1], [2], and [3], with
reference to applications to casualty insurance, merchandizing, and transportation.


2. A restricted occupancy distribution. Restricted occupancy theory deals
with the distribution of $r$ objects among $k$ cells if a maximum of $m$ objects is
permitted per cell, the cells are distinguishable, and empty cells are permitted.
In this paper we shall investigate the distribution of the variables
$v_i$ $(i = 0, 1, 2, \dots, m)$, standing for the number of cells occupied by $i$ objects,
respectively, if $r$ indistinguishable objects are distributed among $k$ cells with a
maximum of $m$ objects per cell.

Since $r$ and $k$ are assumed to be fixed, it should be noted that the variables
$v_i$ are subject to the two linear restrictions

(2)  $\sum_{i=0}^{m} v_i = k$ and $\sum_{i=0}^{m} i \cdot v_i = r,$

and any $m - 1$ of the $v_i$ will thus determine the remaining two.

If we now write as $N(r, k, m)$ the total number of ways in which $r$
indistinguishable objects can be distributed among $k$ cells with at most $m$ objects per
cell, we find that this restricted occupancy coefficient is the coefficient of $x^r$ in

(3)  $f(x; k, m) = (1 + x + x^2 + \cdots + x^m)^k.$

Since

$f(x; k, m) = (1 - x^{m+1})^k (1 - x)^{-k} = \dfrac{(1 - x^{m+1})^k}{(1 - x)^k},$

we can write explicitly

(4)  $N(r, k, m) = \sum_{j=0}^{k} (-1)^j \binom{k}{j} \binom{k + r - j(m+1) - 1}{k - 1}.$
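
The explicit formula can be compared with a direct expansion of the generating function (3). The sketch below does this for small arguments; the function names are illustrative, the formula is evaluated for $k \ge 1$ only, and terms of the alternating sum with $r - j(m+1) < 0$ are taken to contribute nothing, which is the convention assumed here for the binomial coefficient.

```python
from math import comb


def count_by_expansion(r: int, k: int, m: int) -> int:
    """Coefficient of x^r in (1 + x + ... + x^m)^k, by repeated polynomial multiplication."""
    coeffs = [1]  # the constant polynomial 1
    for _ in range(k):
        new = [0] * (len(coeffs) + m)
        for power, value in enumerate(coeffs):
            for i in range(m + 1):
                new[power + i] += value
        coeffs = new
    return coeffs[r] if 0 <= r < len(coeffs) else 0


def count_by_formula(r: int, k: int, m: int) -> int:
    """Alternating sum (4) for k >= 1; terms with r - j(m+1) < 0 are skipped."""
    total = 0
    for j in range(k + 1):
        rest = r - j * (m + 1)
        if rest < 0:
            break
        total += (-1) ** j * comb(k, j) * comb(k + rest - 1, k - 1)
    return total


if __name__ == "__main__":
    # Example: r = 3 points, k = 2 judges, scale 0..2 -> the ratings (1, 2) and (2, 1).
    print(count_by_expansion(3, 2, 2), count_by_formula(3, 2, 2))
```

Both routines return 2 for the small example, corresponding to the two admissible allocations listed in the comment.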


If we now let $i_1, i_2, \dots, i_{m-1}$ stand for an arbitrary permutation of any
$m - 1$ of the $m + 1$ numbers $0, 1, 2, \dots, m$, the joint distribution of the $v_{i_j}$
for $j = 1, 2, \dots, m - 1$, may be written as

(5)  $f(v_{i_1}, v_{i_2}, \dots, v_{i_{m-1}}) = \dfrac{\dbinom{k}{v_{i_1}, v_{i_2}, \dots, v_{i_{m-1}}} \dbinom{k - \sum_{j=1}^{m-1} v_{i_j}}{q}}{N(r, k, m)},$

where

$q = \dfrac{k\, i_{m+1} - r - \sum_{j=1}^{m-1} (i_{m+1} - i_j)\, v_{i_j}}{i_{m+1} - i_m}$

and

$\dbinom{k}{v_{i_1}, v_{i_2}, \dots, v_{i_{m-1}}}$

is a multinomial coefficient. Also, $i_m$ and $i_{m+1}$ are the two numbers which were
omitted when choosing the $m - 1$ numbers $i_j$.

The moments of the $v_i$ may be obtained directly from (5), but their derivation
is simplified if we refer to the generating function given in (3). Writing

$f(x; k, m) = [x^i + (1 + x + \cdots + x^m - x^i)]^k = \sum_{j=0}^{k} \binom{k}{j} x^{ij} (1 + x + \cdots + x^m - x^i)^{k-j},$

we find that for a fixed value of $j$, the coefficient of $x^r$ represents the number of
ways in which $r$ indistinguishable objects can be distributed among $k$ cells with
at most $m$ objects per cell under the condition that exactly $j$ cells have exactly $i$
occupants. Writing this coefficient as $N(r, k, m \mid v_i = j)$, the marginal distribution
of $v_i$ becomes

(6)  $P(v_i = j) = \dfrac{N(r, k, m \mid v_i = j)}{N(r, k, m)}.$

If we let $E[v_i^{(t)}]$ stand for the $t$th factorial moment of $v_i$, we have

(7)  $N(r, k, m)\,E[v_i^{(t)}] = \sum_{j=0}^{k} j^{(t)} N(r, k, m \mid v_i = j),$

and the right-hand member of (7) is the coefficient of $x^r$ in

$\sum_{j=0}^{k} j^{(t)} \binom{k}{j} x^{ij} (1 + x + \cdots + x^m - x^i)^{k-j} = k^{(t)} x^{ti} (1 + x + x^2 + \cdots + x^m)^{k-t},$

where $j^{(t)}$ and $k^{(t)}$ are the $t$th factorial powers of $j$ and $k$. We thus find that

(8)  $E[v_i^{(t)}] = \dfrac{k^{(t)} N(r - ti, k - t, m)}{N(r, k, m)}.$

Similarly, if we let $N(r, k, m \mid v_{i_1} = j_1, v_{i_2} = j_2)$ stand for the number of ways
in which $r$ indistinguishable objects can be distributed among $k$ cells with at
most $m$ objects per cell under the condition that exactly $j_1$ cells have exactly $i_1$
occupants and exactly $j_2$ cells have exactly $i_2$ occupants (with $i_1$ not equal to $i_2$),
this quantity is given by the coefficient of $x^r$ in the corresponding term of

$f(x; k, m) = [x^{i_1} + x^{i_2} + (1 + x + \cdots + x^m - x^{i_1} - x^{i_2})]^k = \sum_{j_1=0}^{k} \sum_{j_2=0}^{k} \binom{k}{j_1, j_2} x^{i_1 j_1 + i_2 j_2} (1 + x + \cdots + x^m - x^{i_1} - x^{i_2})^{k - j_1 - j_2}.$

Using the same steps as above, it may then be shown that

(9)  $E[v_{i_1}^{(t_1)} v_{i_2}^{(t_2)}] = \dfrac{k^{(t_1+t_2)} N(r - t_1 i_1 - t_2 i_2,\ k - t_1 - t_2,\ m)}{N(r, k, m)},$

and higher factorial product moments can be obtained in the same way.

The calculation of the restricted-occupancy coefficients needed in the evaluation
of the moments of the $v_i$, and subsequently of the mean and variance of $s^2$
[as defined in (1)], is greatly simplified by the use of the recursion relations

(10a)  $N(r, k, m) = \sum_{i=0}^{m} N(r - i, k - 1, m),$

(10b)  $N(r, k - 1, m) = \sum_{i=1}^{m} \dfrac{ki - r}{r}\,N(r - i, k - 1, m),$

or

(10c)  $N(r, k, m) = N(r - 1, k, m) + N(r, k - 1, m) - N(r - m - 1, k - 1, m).$
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These recursion formulas may be obtained directly by equating the coefficients 
of $x^r$ in

(11a)  $f(x; k, m) = f(x; 1, m)\,f(x; k - 1, m),$
(11b)  $f(x; 1, m)\,f'(x; k, m) = k \cdot f(x; k, m)\,f'(x; 1, m),$

or

(11c)  $(1 - x)\,f(x; k, m) = (1 - x^{m+1})\,f(x; k - 1, m).$

Equation (10c) may also be obtained as an immediate consequence of (10a).
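
A table of the coefficients can be built from (10c) alone and then checked against (10a). The sketch below does so under two boundary conventions that are assumptions of this illustration: $N(0, 0, m) = 1$, and $N$ vanishes for negative $r$. The function names are hypothetical.

```python
def table_by_recursion(r_max: int, k_max: int, m: int):
    """table[k][r] = N(r, k, m) for 0 <= k <= k_max, 0 <= r <= r_max, via recursion (10c)."""
    def get(table, k, r):
        return table[k][r] if r >= 0 else 0  # N vanishes for negative r

    table = [[0] * (r_max + 1) for _ in range(k_max + 1)]
    table[0][0] = 1  # boundary value: no cells, no objects
    for k in range(1, k_max + 1):
        for r in range(r_max + 1):
            table[k][r] = (get(table, k, r - 1) + get(table, k - 1, r)
                           - get(table, k - 1, r - m - 1))
    return table


def check_10a(table, m: int) -> bool:
    """True if every entry also satisfies the convolution recursion (10a)."""
    for k in range(1, len(table)):
        for r in range(len(table[k])):
            conv = sum(table[k - 1][r - i] for i in range(m + 1) if r - i >= 0)
            if conv != table[k][r]:
                return False
    return True


if __name__ == "__main__":
    t = table_by_recursion(r_max=8, k_max=4, m=2)
    print("N(r, 4, 2) for r = 0..8:", t[4])
    print("consistent with (10a):", check_10a(t, 2))
```

Recursion (10c) needs only three earlier entries per coefficient, whereas (10a) needs $m + 1$, which is presumably why it is convenient for extensive tabulation.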

Suitable tables for an exact test of significance for $m = 2$ are in preparation.
Tables of the restricted-occupancy coefficients $N(r, k, 4)$ for values of $k$ from
1 to 20 and of $r$ from 1 to 40 are given in [4]. It was also found experimentally
for $m = 2$ and $m = 4$ that the chi-square criterion applied to the observed $v_i$
and their expectations provides a good approximate test of significance.
A VECTOR FORM OF THE WALD-WOLFOWITZ-HOEFFDING THEOREM¹
By D. A. S. Fraser
University of Toronto and Princeton University 
1. Summary. Hotelling and Pabst [1] showed that the rank correlation co- 
efficient had a limiting normal distribution under the equally likely permutations 
of the hypothesis of independence. Wald and Wolfowitz [2] developed a general 
theorem of this type, and Noether [3] and Hoeffding [4] have relaxed the con- 
ditions used therein. In this paper a vector form of the theorem is proved along 


the lines used in an example by Wald and Wolfowitz [1] but taking account of 
the singular cases in which the correlations approach one. 


2. The theorem. For each positive integer $n$ let $\|C_{n1}(i, j)\|, \dots, \|C_{nk}(i, j)\|$
be $n \times n$ matrices of real numbers. Also let $(R_1, \dots, R_n)$ be a random variable
which takes each permutation of $(1, \dots, n)$ with the same probability, $(n!)^{-1}$.
We shall be concerned with the limiting form of the joint distribution of $S_{n1}, \dots, S_{nk}$, where

$S_{n\alpha} = \sum_{i=1}^{n} C_{n\alpha}(i, R_i).$

First we find the means, variances, and covariances. Obviously

(1)  $E\{S_{n\alpha}\} = \dfrac{1}{n} \sum_{i,j=1}^{n} C_{n\alpha}(i, j).$


For the variances and covariances, it is convenient to adjust the matrices so
that they sum to zero by rows and columns; we define

(2)  $d_{n\alpha}(i, j) = C_{n\alpha}(i, j) - \dfrac{1}{n} \sum_{k=1}^{n} C_{n\alpha}(k, j) - \dfrac{1}{n} \sum_{l=1}^{n} C_{n\alpha}(i, l) + \dfrac{1}{n^2} \sum_{k,l=1}^{n} C_{n\alpha}(k, l),$

and notice that

(3)  $S_{n\alpha} - E\{S_{n\alpha}\} = \sum_{i=1}^{n} d_{n\alpha}(i, R_i).$

From the equations

$\sum_{i=1}^{n} d_{n\alpha}(i, j) = 0 = \sum_{j=1}^{n} d_{n\alpha}(i, j),$

it is straightforward to prove that

(4)  $\operatorname{cov}\{S_{n\alpha}, S_{n\beta}\} = \dfrac{1}{n - 1} \sum_{i,j=1}^{n} d_{n\alpha}(i, j)\,d_{n\beta}(i, j).$

And in particular

(5)  $\operatorname{Var}\{S_{n\alpha}\} = \dfrac{1}{n - 1} \sum_{i,j=1}^{n} d_{n\alpha}^2(i, j).$
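
For small $n$ the covariance formula (4) can be verified by enumerating all $n!$ permutations. The sketch below does this for randomly filled $4 \times 4$ matrices; the function names, the matrix size and the seed are illustrative assumptions, and indices run from 0 rather than 1.

```python
from itertools import permutations
import random


def centered(c):
    """Doubly centered matrix d(i, j) as in (2): rows and columns sum to zero."""
    n = len(c)
    row = [sum(c[i]) / n for i in range(n)]
    col = [sum(c[i][j] for i in range(n)) / n for j in range(n)]
    grand = sum(row) / n
    return [[c[i][j] - col[j] - row[i] + grand for j in range(n)] for i in range(n)]


def exact_cov(c_a, c_b):
    """Covariance of S_a and S_b by enumerating all n! equally likely permutations."""
    n = len(c_a)
    perms = list(permutations(range(n)))
    s_a = [sum(c_a[i][p[i]] for i in range(n)) for p in perms]
    s_b = [sum(c_b[i][p[i]] for i in range(n)) for p in perms]
    mean_a, mean_b = sum(s_a) / len(perms), sum(s_b) / len(perms)
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(s_a, s_b)) / len(perms)


def formula_cov(c_a, c_b):
    """Right-hand side of (4): sum of d_a * d_b over all cells, divided by n - 1."""
    n = len(c_a)
    d_a, d_b = centered(c_a), centered(c_b)
    return sum(d_a[i][j] * d_b[i][j] for i in range(n) for j in range(n)) / (n - 1)


if __name__ == "__main__":
    rng = random.Random(0)
    a = [[rng.random() for _ in range(4)] for _ in range(4)]
    b = [[rng.random() for _ in range(4)] for _ in range(4)]
    print(exact_cov(a, b), formula_cov(a, b))
```

Calling both functions with the same matrix twice gives the corresponding check of the variance formula (5).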


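We designate by $\rho_{n\alpha\beta}$ the correlation between $S_{n\alpha}$ and $S_{n\beta}$; then we have

THEOREM. If for each $\alpha = 1, \dots, k$,

(6)  $\lim_{n \to \infty} \dfrac{\frac{1}{n} \sum_{i,j=1}^{n} d_{n\alpha}^{r}(i, j)}{\left[\frac{1}{n} \sum_{i,j=1}^{n} d_{n\alpha}^{2}(i, j)\right]^{r/2}} = 0, \qquad r = 3, 4, \dots,$

then a necessary and sufficient condition that

$\left(\dfrac{S_{n1} - E\{S_{n1}\}}{[\operatorname{Var}\{S_{n1}\}]^{1/2}}, \dots, \dfrac{S_{nk} - E\{S_{nk}\}}{[\operatorname{Var}\{S_{nk}\}]^{1/2}}\right)$

have a limiting $k$-variate normal distribution is that the correlations $\rho_{n\alpha\beta}$ approach
limits as $n \to \infty$. The limiting distribution has means zero, variances one, and
correlations $\rho_{\alpha\beta}$, where

$\rho_{\alpha\beta} = \lim_{n \to \infty} \rho_{n\alpha\beta}.$

Condition (6) is satisfied if for all $\alpha = 1, \dots, k$,

(7)  $\lim_{n \to \infty} \dfrac{\max_{i,j} d_{n\alpha}^{2}(i, j)}{\frac{1}{n} \sum_{i,j=1}^{n} d_{n\alpha}^{2}(i, j)} = 0.$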

PROOF. The proof is based on Hoeffding's Theorem 3, which is essentially
the theorem above with $k = 1$. We assume, without loss of generality, that for
each $\alpha$ and for each $n$ large enough that the denominator of (7) is nonzero, a
scale factor has been applied to the $C_{n\alpha}(i, j)$, $d_{n\alpha}(i, j)$ to make $\operatorname{Var}\{S_{n\alpha}\} = 1$.
Also for simplicity we consider the case $k = 2$.

Consider a linear combination, $d_n(i, j) = a\,d_{n1}(i, j) + b\,d_{n2}(i, j)$, and the
corresponding random variable

$S_n - E\{S_n\} = \sum_{i=1}^{n} d_n(i, R_i) = a[S_{n1} - E\{S_{n1}\}] + b[S_{n2} - E\{S_{n2}\}].$

We check to see if this random variable satisfies the conditions of Hoeffding's
theorem. We have

(8)  $\dfrac{1}{n} \sum_{i,j=1}^{n} |d_n(i, j)|^r \le \max\{|a|^r, |b|^r\}\,\dfrac{1}{n} \sum_{i,j=1}^{n} \left[|d_{n1}(i, j)| + |d_{n2}(i, j)|\right]^r$
$\qquad \le \max\{|a|^r, |b|^r\}\,\dfrac{1}{n} \sum_{i,j=1}^{n} \left[|2d_{n1}(i, j)|^r + |2d_{n2}(i, j)|^r\right]$
$\qquad = 2^r \max\{|a|^r, |b|^r\} \left[\dfrac{1}{n} \sum_{i,j=1}^{n} |d_{n1}(i, j)|^r + \dfrac{1}{n} \sum_{i,j=1}^{n} |d_{n2}(i, j)|^r\right].$

Now if $\lim \rho_{n12}\,(= \rho_{12})$ exists, then the limit of $\operatorname{Var}\{S_n\}$ exists. If this limit is zero,
then $S_n - E\{S_n\}$ converges in probability to zero; that is, it has a degenerate
normal limiting distribution. If the limit of $\operatorname{Var}\{S_n\}$ is greater than zero, then
(8) implies that Condition (6) is satisfied. However, for this it is necessary to
note that Condition (6) is equivalently given by the same expression but with
absolute bars around $d_{n\alpha}(i, j)$; this was proved by Hoeffding in [4]. It follows
then that

$S_n - E\{S_n\} = a[S_{n1} - E\{S_{n1}\}] + b[S_{n2} - E\{S_{n2}\}]$

has a limiting normal distribution. This limiting distribution has mean zero and
variance $a^2 + b^2 + 2ab\rho_{12}$.
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Let (Y, Y’) designate a random variable having a bivariate normal distribu- 
tion with means 0, variances 1, and correlation py . Then the limiting distribu- 
tion of a(Sn — E{Sai}) + b(Sn. — E{Sy2}) is the distribution of aY + bY’ 
for all a, b. If we knew that (S,; — E{Sia}, Sno — E{Sn2}) had a limiting dis- 
tribution, say the distribution of a random variable (Z, Z’), then it would follow 
that the linear compound would have the distribution of aZ + bZ’. But this 
means that the random variable aY + bY’ is equivalent to aZ + bZ’ for all 
a, b. By Cramér [5], p. 105, this implies that the random variables (Z, Z’) and 
(Y, Y’) are equivalent. If a limiting distribution did not exist for 


f 


(Su — E{S 


1 y mY \ 
al}; Ont — 2 iSn2}); 


then we could extract on n two subsequences which have limiting distributions 
that are different. This contradicts the statement that the limiting distribution 
must be that of (Y, Y’). This proves the limiting normality when the correlation 
approaches a limit. 

If limn.o paz does not exist, then we can extract on n two subsequences with 
different limits. Then, by the argument above, the two subsequences of random 
variables would have limiting normal distributions which are different. This 
implies that the original sequence of random variables does not have a limiting 
distribution. The proof is completed. 


ABSTRACTS OF PAPERS


(Abstracts of papers presented at the Princeton meeting of the Institute, April 20-21, 1956) 


1. On Certain Stabilities of Sample Survey Response, (Preliminary Report), 
David Rosenblatt, American University, (By Title).


In economic and demographic surveys, studies of reporting behavior are sometimes
undertaken through reinterviews of identical respondents using a similar or more intense
mode of inquiry. Stability of response is examined in the light of cross-tabulation data on
first vs. second response. Assume: (1) there exist response observables $a_1, \dots, a_r$, crypto-states
$\beta_1, \dots, \beta_L$, generally unobservable, which may, but need not, correspond to
response observables; (2) $L \times r$ stochastic matrices $\Phi_1$, $\Phi_2$, respectively, giving for each
crypto-state the conditional probabilities of response in first and second surveys; (3) an
initial distribution $h$ over crypto-states for some set of respondents; (4) an $L \times L$ stochastic
matrix $Q$ governing transitions among crypto-states for respondents in the interval between
surveys. In effect, the conditional distributions of second response depend directly only
on second crypto-state and, in turn, the probabilities of entering second crypto-states
depend directly only on first crypto-states. The expected joint distribution of first and
second response is then given by $C(h) = \Phi_1' D(h) Q \Phi_2$, where the diagonal matrix $D(h)$ has $h$ on
its main diagonal. A n.a.s.c. for the marginal distributions of $C(h)$ to be equal, given any $h$, is
that $\Phi_1 = Q\Phi_2$; any $C(h)$ is then necessarily symmetric. If one assumes crypto-states to be a
fixed reference frame for historical truth, $L = r$, and $\Phi_2 = I$, the identity, then an observable
cross-tabulation may provide an estimate of response pattern $\Phi_1$. (Received March 12,
1956.)


2. Some Multi-Level Continuous Sampling Plans, C. Derman, S. Littauer,
and H. Solomon, Columbia University.


Lieberman and Solomon (Ann. Math. Stat., Vol. 26(1955), pp. 686-705) introduced a 
multilevel continuous sampling plan which allows for any number of sampling levels sub- 
ject to the provision that transitions can occur only between adjacent levels. Some exten- 
sions of these plans are discussed. Specifically, (i) the situation where transition to levels 
having smaller sampling rates occurs one level at a time when quality is good and immediate 
transition to 100 per cent inspection occurs at any sampling level when quality is poor; 
(ii) the situation where transition to levels having smaller sampling rates occurs s levels at 
a time when quality is good and transition to levels having higher sampling rates occurs 
h levels at a time when quality is poor. The Average Outgoing Quality Function (AOQ) is 
derived for (i) for k levels, finite and infinite. For k infinite the Average Outgoing Quality 
Limit (AOQL) is derived and seems to have a simple relationship to the contours of equal 
AOQL already exhibited in the Lieberman-Solomon paper. For k = 1, the AOQ reduces 
to that of the Dodge plan (Ann. Math. Stat., Vol. 14(1943), pp. 264-279). For (ii), bounds 
are obtained for the AOQL. (Received March 12, 1956.) 


3. Applications of Vector-Valued Risk Functionals, (Preliminary Report), 
Allan Birnbaum, Columbia University.


Lionel Weiss has shown (Ann. Math. Stat., Vol. 24(1953), pp. 677-80) that the methods of 
decision theory have natural extensions to the case in which the real-valued loss function 
is replaced by a vector, of which each component measures one aspect of the desirability of 
an outcome of a statistical decision problem. Adopting Weiss’s notation, a useful gen- 


eralization is obtained by replacing the risk-component yi. = [ (Diians(x)W ijn x)dF; 
Jz 
by any linear functional R.(n) of the decision function 7 = (m (x), --- , n1(xz)), and y* by the 
(vector-valued) linear risk functional R(n) = (Ri(n), --- , Ru(m)). If the class of allowed 
decision functions @ = {n} is convex (An“) + (1 — A)n® e Sif 7 € 6,0 <A <1), R(m) maps 
& onto a convex U-dimensional set S in which admissible points and complete-class subsets 
can be characterized in the usual ways. One application is a formal unification of Wald’s 
decision theory with the test theory based on a general form of the Neyman-Pearson lemma.
This lemma is proved equivalent to admissibility of a point in S under general conditions. 
Let $\eta_1(x)$ = probability of rejecting $H_0: \theta = \theta_0$ when $x$ is observed,


6 = (0,, --- 6), Ref) = m(x) dF (zx), R;(m) = 
Jz 


a 
58, Re(n) 
Then R(n) = (Re, , Ri , Ru) leads to tests of Type A (taking t = 1), and (taking t = 2)
R(n) = (Rey , Ri, Re , Riz , Ru , Rez) leads to tests of Type C. Other applications (for which
Weiss’s formulation suffices) are made to mixed single-sample tests and double-sampling
tests. (Received March 12, 1956.)


4. On Mixed Single-Sample Tests, Leonard Cohen, Columbia University
(introduced by A. Birnbaum).


Let $X$ denote a random variable with cumulative distribution function $F(x, \theta)$, and
consider tests of a simple hypothesis $H_0: \theta = \theta_0$ against the simple alternative $H_1: \theta = \theta_1$.
A single-sample (fixed sample-size) test is identified with a triple $(\alpha, \beta, n)$ where $\alpha$, $\beta$ denote
the probabilities of a type I and type II error, respectively, and $n$ the sample size. A mixed
single-sample test is a sequence of quadruples $\{(\gamma_i, \alpha_i, \beta_i, n_i)\}$, where $\gamma_i \ge 0$, $\sum_{i=0}^{\infty} \gamma_i = 1$, and
$(\alpha_i, \beta_i, n_i)$ is a single-sample test, for $i = 0, 1, 2, \dots$; $\gamma_i$ is interpreted as the probability of
using the single-sample test $(\alpha_i, \beta_i, n_i)$. A mixed single-sample test will be identified with a
triple $(\alpha, \beta, n) = \sum_{i=0}^{\infty} \gamma_i (\alpha_i, \beta_i, n_i)$. A mixed single-sample test $(\alpha, \beta, n)$ is admissible if for
any other mixed single-sample test $(\alpha', \beta', n')$ either $\alpha' > \alpha$, $\beta' > \beta$ or $n' > n$. Treating
$(\alpha, \beta, n)$ as a vector-valued risk function, a method for constructing a complete class of
admissible mixed single-sample tests is developed and applied to tests on parameters of
normal, binomial, and rectangular distributions. For each testing problem it is shown which
fixed sample-size tests can be improved upon, and by how much, by use of mixed tests.
Examples of such improvements have been given by Kruskal and Raiffa. The method is
also applied to confidence intervals evaluated in terms of average sample sizes, average
lengths and confidence coefficients. (Received March 12, 1956.)
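
The identification of a mixed test with a weighted average of triples can be illustrated with a small sketch. It is not taken from the paper: the function name `mix_tests` and the numerical triples are hypothetical, and only finitely many tests receive positive weight.

```python
from typing import Sequence, Tuple

Triple = Tuple[float, float, float]  # (alpha, beta, n) for one fixed-sample-size test


def mix_tests(weights: Sequence[float], tests: Sequence[Triple]) -> Triple:
    """Return the (alpha, beta, n) triple of a mixed single-sample test.

    weights[i] is the probability of using tests[i]; the mixed test is
    identified with the weighted average of the component triples.
    """
    if len(weights) != len(tests):
        raise ValueError("need one weight per test")
    if any(g < 0 for g in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to 1")
    alpha = sum(g * t[0] for g, t in zip(weights, tests))
    beta = sum(g * t[1] for g, t in zip(weights, tests))
    n = sum(g * t[2] for g, t in zip(weights, tests))
    return (alpha, beta, n)


# Hypothetical component tests, chosen only to show the arithmetic.
example = mix_tests([0.5, 0.5], [(0.05, 0.20, 10), (0.05, 0.10, 20)])
```

The third component of the result is an average sample size, so it need not be an integer even when every component test has an integer sample size.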


5. On the Use of Randomization in the Investigation of an Unknown
Function, Robert Hooke, Westinghouse Electric Corporation.


Much of research in the physical sciences consists in the experimental study of a function 
of one or more variables. Experimental errors of a random kind are usually present, but 
often there are biases which are of equal or greater importance. In some cases of this sort 
the bias can be removed, as in the classical theory of design of experiments, by randomiza- 
tion. For example, if it is desired to estimate a definite integral, randomization implies 
taking observations at randomly selected (rather than equally spaced) points in the range 
of integration. Stratified sampling can be used, and a proper design can provide an unbiased 
estimate of the integral, together with estimates of error broken down into a part
attributable to experimental error and a part attributable to the random sampling process.
Similar observations may apply to the goodness-of-fit problem. (Received March 12, 1956.)
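
A minimal sketch of the integral example, under assumptions not stated in the abstract: the strata are equal-width subintervals, one observation is taken at a uniformly random point in each, and the function is observed without experimental error. The name `stratified_integral` is hypothetical.

```python
import random
from typing import Callable


def stratified_integral(f: Callable[[float], float], a: float, b: float,
                        strata: int, rng: random.Random) -> float:
    """Estimate the integral of f over [a, b] by stratified random sampling.

    The range is cut into `strata` equal-width intervals and f is observed
    at one uniformly random point in each. Each term f(x) * width has
    expectation equal to the integral of f over its stratum, so the sum
    is an unbiased estimate of the whole integral.
    """
    if strata <= 0:
        raise ValueError("strata must be positive")
    width = (b - a) / strata
    total = 0.0
    for k in range(strata):
        x = a + (k + rng.random()) * width  # random point in stratum k
        total += f(x) * width
    return total


rng = random.Random(0)  # fixed seed so the sketch is reproducible
estimate = stratified_integral(lambda x: x * x, 0.0, 1.0, 100, rng)
```

With equally spaced points the error would be a fixed bias for a given function; with random points it becomes a sampling error with mean zero, which is the sense in which randomization removes bias here.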


6. Factorials in Near-Balanced Incomplete Block Designs for k(k - 1)
Treatments, (Preliminary Report), C. Y. Kramer and R. A. Bradley, Virginia
Polytechnic Institute, (By Title).


In this paper the adjusted treatment sum of squares in near-balanced incomplete block 
designs with k(k - 1) treatments is given as a function of the estimated treatment effects.
This new form is valid both when the blocks are grouped into replications in the form 
of the near-balanced rectangular lattice or the latinized rectangular lattice and when no 
grouping into replications is effected. By imposing factorial arrangements of treatments on 
these designs and by using orthogonal contrasts, the sums of squares for the factorial effects 
with one degree of freedom are found as functions of the estimated treatment effects. A 
complete analysis of variance for the factorials set in these designs is developed. If the 
factorial treatments are assigned to the blocks in a special manner, the sums of squares 
for a (k - 1)-level factor are obtained free of block adjustment. This work is preliminary
to a more intensive study of the analysis of factorial arrangements of treatments
introduced into incomplete block designs of various classes. That this can be done for any
balanced incomplete block design seems to be well known but not discussed in the statistical
literature. (Received March 12, 1956.)


7. The Comparative Importance of the Order of an Observation in Determining
Linear Estimates of the Mean and Standard Deviation of the Normal
Distribution from Censored Samples, N ≤ 10, A. E. Sarhan and B. G.
Greenberg, University of North Carolina.


In previous work, tables were calculated for linear estimation of the mean and standard
deviation from censored samples of size ≤ 10 from a normal population. The coefficients and
relative efficiencies in these tables present certain patterns which shed light on the relative 
contributions made by observations dependent upon order in the sample. For example, for 
the known sample elements after censoring, the relative contribution which undergoes 
maximum change in estimating the mean and standard deviation is that associated with the 
extreme known sample elements. Also, for a fixed sample size and fixed number of censored 
elements from the left, the importance of the largest known element increases whereas 
that of the smallest known element decreases as the number of censored elements on the 
right increases. 

The table for relative efficiencies shows that the efficiencies of the estimate of $\sigma$ drop
more rapidly than those of $\mu$. This table also shows that for fixed n and fixed uncensored
sample size, the efficiency of the best estimate of $\sigma$ is constant, independent of whether
the censored elements are from the right or left.

Furthermore, results show that, in estimating the mean, a single central value is worth
more than half the sample, while the reverse holds in estimating $\sigma$. (Received March 12,
1956.)


8. Relations Between Stochastic Processes, Bayard Rankin, Massachusetts
Institute of Technology.


For each $t \ge 0$, $X(t)$ and $Y(t)$ are random variables defined on $(\Omega, \mathcal{B}, P)$. $\mathcal{B}(X(t))$ is the
Borel field induced on $\Omega$ by $X(t)$ and $P^{X(t)}B$ is the conditional probability of $B$ w.r.t. $X(t)$.
The separable stochastic processes $X = \{X(t), t \ge 0\}$, $Y = \{Y(t), t \ge 0\}$ are related if
@(Y) c @(X). (All following statements hold for all t > r = 0, B e @(Y(t)).) We investi- 
gate the special relation defined by (1) @(Y(t)) C @(X(t)) and (2) PYB = PXB., if 
Y~ is any stochastic process such that Y~(t) induces the same Borel field as {Y(r),0 S 7 St}, 
then X = Y~ satisfies (1). The following is basic to this paper: Theorem: Let X, Y satisfy
(1). PYOB= PY Band PY B= PX ©B if and only if PYOB = PXB and PXB = 
PX “B. This theorem has the logical form AB if and only if CD. It is proved that neither 
A, B, C, nor D imply one another (the theorem is strong) and there exist X, Y satisfying 
(1) and AB (the theorem is nontrivial). The former follows from the Theorem: For each of
the conditions ACD, ADC, BCD, BDC there exist X, Y satisfying it as well as (1), where 
D = not D. This work was supported in part by the Office of Naval Research. (Received 
March 12, 1956.) 


9. The Two Sample Multivariate Problem in the Degenerate Case, (Pre- 
liminary Report), A. P. Dempster, Princeton University. 


Samples of $n_1$ and $n_2$ individuals are available from two $k$-variable normal populations with
common variances and covariances but possibly different means. When $n_1 + n_2 - 2 < k$,
the classical methods for assessing population separation fail, so this may be called the
degenerate case. It appears necessary to give up the desirable property of invariance of the
method under all linear transformations of $k$-space, and so the present method depends on
an original choice of units. In $k$-space, vectors $V_0$ joining the sample means and $V_{rj}$ joining
the $r$th sample mean to the $j$th point of the $r$th sample ($r = 1, 2$; $j = 1, 2, \dots, n_r$) are
defined. By transforming on individuals, $n_1 + n_2 - 2$ linear combinations of the $V_{rj}$ can be
found which are independently and identically distributed and, under the null hypothesis
$H$ of equal means, independent of and distributed as $V_0$. Deviations from $H$ will show by
$V_0$ being “too long”. The distribution theory of vector lengths and other geometrically
motivated statistics is described and approximations given. Corresponding approximate 
tests are discussed, some based only on lengths and others seeking to use more of the infor- 
mation present. (Received March 12, 1956.) 


10. A Test for Independence in Contingency Tables, (Preliminary Report), 
James G. C. TEMPLETON, Princeton University. 


It is well known that the Pearson $\chi^2$ test for independence in contingency tables lacks
power when used as a test against certain specific alternatives. A test is proposed, based
upon “missing-plot” estimates of the expected frequencies, which is believed to be more
satisfactory than the $\chi^2$ test when the departure from independence is due to disturbances
confined to a few cells of the table. (Received March 12, 1956.)


11. An Extension of a Method of Making Multiple Comparisons, (Preliminary 
Report), Thomas E. Kurtz, Princeton University.


A method of Tukey [Proc. Fifth Ann. Convention, Amer. Soc. for Quality Control, (1951), 
p. 189.] is extended to include the treatment of data having unequal variances. The
allowance for comparisons of the form $(Y_i - Y_j)$, $i, j = 1, 2, \dots, n$, is $q_\alpha(n, f)\, s \sqrt{(g_i + g_j)/2}$,
where $\mathrm{Var}(Y_i) = g_i \sigma^2$, $s^2$ is the usual unbiased estimate of $\sigma^2$, and $q_\alpha(n, f)$ is the $\alpha$th
percent point of the studentized range of $n$ based on $f$ degrees of freedom. A norm for contrasts
is obtained such that the resulting statements about contrasts follow as logical
consequences from the set of statements about comparisons. The error rate of this extension is
shown to be $\le \alpha$ for $n = 3$ and for certain special values of the $\{g_i\}$. The implication is
that the error rate is always $\le \alpha$. The procedure would thus be conservative. When
compared with a method of Scheffé [Biometrika, Vol. 40 (1953), p. 87], the extension is shown to
provide intervals at least as short for unequal variances as the unmodified Tukey method
does for equal variances. (Received March 6, 1956.)
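
The allowance can be sketched numerically, reading it as the studentized-range percent point times $s$ times the square root of $(g_i + g_j)/2$. The function names are hypothetical, and the percent point `q` must be supplied by the user from a table of the studentized range; the value used in the example is a placeholder, not a tabulated value.

```python
import math
from typing import Tuple


def allowance(q: float, s: float, g_i: float, g_j: float) -> float:
    """Allowance for the comparison Y_i - Y_j when Var(Y_i) = g_i * sigma^2.

    q is the chosen percent point of the studentized range (looked up by
    the caller) and s is the square root of the unbiased estimate of sigma^2.
    """
    if min(g_i, g_j) < 0 or s < 0:
        raise ValueError("g_i, g_j and s must be non-negative")
    return q * s * math.sqrt((g_i + g_j) / 2.0)


def comparison_interval(y_i: float, y_j: float, q: float, s: float,
                        g_i: float, g_j: float) -> Tuple[float, float]:
    """Interval for the difference: observed difference plus or minus the allowance."""
    w = allowance(q, s, g_i, g_j)
    diff = y_i - y_j
    return (diff - w, diff + w)


# Placeholder inputs: q = 3.0 is NOT a table value, it only shows the arithmetic.
low, high = comparison_interval(10.0, 7.0, q=3.0, s=2.0, g_i=0.5, g_j=1.5)
```

When every $g_i$ equals a common value $g$, the square root reduces to $\sqrt{g}$ and the allowance is the same for every pair, which is the equal-variance case of the unmodified method.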


12. On the Power of Some Rank Order Two-Sample Tests, Joan Raup
Rosenblatt, National Bureau of Standards.


In a given class $D$ of pairs of distributions $(F, G)$, the null hypothesis $F = G$ is to be
tested against a wide class of alternatives $D_1 \subset D$. Among rank-order tests considered
are the Mann-Whitney test, a two-sample test proposed by Lehmann (1951), and others.
The power of a test is considered as a function of $\delta = \int F\,dG$, $\Delta^2 = \int (F - G)^2\, d(F + G)$,
and other functionals of the pair $(F, G)$, under the condition that $D_1$ contains only pairs of
continuous distributions. Also considered is the symmetrical problem of deciding between
two subsets $D_0$, $D_1$ of $D$ when each is defined in terms of values of $\delta(F, G)$. Comparisons
among the various tests with respect to power or efficiency are made for several ranges of
sample sizes. (Received March 21, 1956.)
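
The functional given by the integral of $F$ with respect to $G$ equals the probability that an observation from $F$ does not exceed an independent observation from $G$. A minimal sketch of its sample analogue, the proportion of such pairs (the quantity underlying the Mann-Whitney count), follows; the function name `delta_hat` is hypothetical.

```python
from typing import Sequence


def delta_hat(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Proportion of pairs (x, y) with x <= y.

    xs is a sample from F and ys a sample from G. For continuous
    distributions ties have probability zero, so <= and < agree
    almost surely.
    """
    if not xs or not ys:
        raise ValueError("both samples must be non-empty")
    count = sum(1 for x in xs for y in ys if x <= y)
    return count / (len(xs) * len(ys))


value = delta_hat([1.0, 2.0, 3.0], [2.5, 4.0])
```

Under the null hypothesis $F = G$ with continuous distributions the functional equals 1/2, so values of the sample proportion far from 1/2 point toward the alternative.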


13. A Note on Ranking Means, W. A. Thompson, Jr., Fort Bliss, Texas.


Many methods have been proposed for ranking means in analysis of variance and it has 
been suggested that the method to use may depend on the nature of the loss function. The 
results of this paper seem to indicate that for one plausible type of loss function the
“old-fashioned” least significant difference test is appropriate. (Received March 13, 1956.)


14. Multivariate Ratio Estimation for Finite Populations, Ingram Olkin,
University of Chicago.


In sample surveys, precision in estimating the unknown population mean $\bar Y$ may be
increased by using an auxiliary variate $X$, which is positively correlated with $Y$ and whose
mean $\bar X$ is known. Two such procedures are ratio and regression estimation. This paper is
concerned with a multivariate extension of ratio estimation. Let $(Y, X_1, \dots, X_p)$ denote
a finite multivariate population of $N$ elements, where the means $\bar X_1, \dots, \bar X_p$ are known.
On the basis of a sample $(y_i, x_{1i}, \dots, x_{pi})$, $i = 1, \dots, n$, $\bar Y$ is to be estimated. The
proposed ratio estimate is a linear combination of the individual ratio estimates:
$\hat y = w_1 r_1 \bar X_1 + \dots + w_p r_p \bar X_p$, where $r_i = \bar y / \bar x_i$, and $w = (w_1, \dots, w_p)$ is a weight function.
The optimum weight function which minimizes the approximate mean square error is given by


$$w = e(A + b'b)^{-1} / e(A + b'b)^{-1}e',$$


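where $e = (1, \dots, 1)$, $b_i = \sqrt{n}\,(C_i^2 - \rho_{0i} C_0 C_i)$ $(i = 1, \dots, p)$, $A: p \times p$,
$a_{ij} = C_0^2 - \rho_{0i} C_0 C_i - \rho_{0j} C_0 C_j + \rho_{ij} C_i C_j$; $C_i$ is the coefficient of variation of $X_i$ ($Y = X_0$), $\rho_{ij}$ is the correlation
between $X_i$ and $X_j$.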

Estimates of the approximate variance and mean-square error of $\hat y$ and some comparisons
of estimation procedures are given. Also, an extension to stratified sampling and the ap- 
proach to normality are considered. (Received March 9, 1956.) 
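
The weight formula has the form of a vector of ones times an inverse matrix, normalized so the weights sum to one. The sketch below computes weights of that form for any symmetric nonsingular matrix `M` supplied by the user (standing in for the matrix being inverted in the abstract). The function names and the example matrix are hypothetical, and the small Gaussian-elimination solver is included only to keep the block self-contained.

```python
from typing import List, Sequence


def solve(M: Sequence[Sequence[float]], rhs: Sequence[float]) -> List[float]:
    """Solve M x = rhs by Gaussian elimination with partial pivoting."""
    n = len(M)
    a = [list(map(float, row)) + [float(r)] for row, r in zip(M, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n + 1):
                a[r][c] -= factor * a[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(a[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (a[r][n] - tail) / a[r][r]
    return x


def optimum_weights(M: Sequence[Sequence[float]]) -> List[float]:
    """Weights w = e M^{-1} / (e M^{-1} e') for symmetric M, e = (1, ..., 1).

    Because M is assumed symmetric, e M^{-1} equals the transpose of
    M^{-1} e', so one linear solve suffices.
    """
    z = solve(M, [1.0] * len(M))
    total = sum(z)
    return [v / total for v in z]


# Hypothetical 2 x 2 matrix, chosen only to show the arithmetic.
w = optimum_weights([[1.0, 0.0], [0.0, 3.0]])
```

By construction the weights sum to one, and an auxiliary variate whose ratio estimate has the smaller error term receives the larger weight.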


15. Runs Above the Sample Mean, Herbert T. David, University of Chicago.


Let $R_d^{(n)}$ be the number of runs of length $d$ above the sample mean, in a sample of $n$
from a normal population. The distribution of $R_d^{(n)}$ is computed in closed form for $n = 4, 5$.
Formulas are given for evaluating the distribution of $R_d^{(n)}$ for all finite $n$. The evaluation of
this distribution amounts essentially to solving a problem (the Demon Problem) posed by
J. Youden. Asymptotic upper and lower bounds are also given for $P\{R_d^{(n)} = 1\}$, $n$ large. It
is further shown that $R_d^{(n)}$ is asymptotically normal, with asymptotic variance $V_d$ given by
the equation $W_d - V_d = (2/\pi)(W_d - U_d)$, where $W_d$ and $U_d$ are, respectively, the
asymptotic variances of runs of length $d$ above the population mean and population median, as
given by A. M. Mood. (Received March 9, 1956.)
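
The statistic itself is easy to compute for a given sample. The sketch below counts maximal runs of exactly `d` consecutive observations above the sample mean; taking "length d" to mean exactly `d` is an assumption, and the function name is hypothetical.

```python
from typing import Sequence


def runs_above_mean(sample: Sequence[float], d: int) -> int:
    """Count maximal runs of exactly d consecutive values above the sample mean."""
    if not sample:
        raise ValueError("sample must be non-empty")
    mean = sum(sample) / len(sample)
    count = 0
    run = 0  # length of the current run of values above the mean
    for x in sample:
        if x > mean:
            run += 1
        else:
            if run == d:
                count += 1
            run = 0
    if run == d:  # a run may end at the last observation
        count += 1
    return count


# Mean of this sample is 3.5; the above-mean pattern is T, T, F, T, F, F.
r2 = runs_above_mean([5, 6, 1, 7, 1, 1], 2)
r1 = runs_above_mean([5, 6, 1, 7, 1, 1], 1)
```

Because the cut point is the sample mean rather than a fixed population value, the indicators of "above" are dependent across observations, which is what makes the distribution theory in the abstract nontrivial.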


16. Approximations to the Power of Rank Tests, Chia Kuei Tsao, Wayne
University, (By Title).


The paper proposes a method for approximating the distribution of the ranks, which is
the basis for evaluating the power of an arbitrary rank test. Let $F_0, \dots, F_k$ be $k + 1$
continuous cdf's. Let $Z_{i1}, \dots, Z_{im_i}$ be the ordered results of a random sample drawn from
$F_i(z)$, $i = 0, \dots, k$. Let $\theta = (\theta_{01}, \dots, \theta_{0m_0}, \dots, \theta_{k1}, \dots, \theta_{km_k})$ be the ranks of


$$Z = (Z_{01}, \dots, Z_{0m_0}, \dots, Z_{k1}, \dots, Z_{km_k}).$$

Let $T(z)$ be a continuous, strictly increasing function such that $T(-\infty) = 0$, $T(\infty) = 1$.
Let $\Phi_i(v) = \Phi_i(T(z)) = F_i(z)$, $i = 0, \dots, k$. Let $V_{i1}, \dots, V_{im_i}$ be the ordered results of a
random sample drawn from the cdf $\Phi_i(v)$, $i = 0, \dots, k$. Let

$$\gamma = (\gamma_{01}, \dots, \gamma_{0m_0}, \dots, \gamma_{k1}, \dots, \gamma_{km_k})$$

be the ranks of $V = (V_{01}, \dots, V_{0m_0}, \dots, V_{k1}, \dots, V_{km_k})$. Then the distribution of $\theta$,
say $P(R)$, is identical with that of $\gamma$, and hence, according to a theorem due to Hoeffding
[Proceedings of the Second Berkeley Symposium on Math. Stat. and Prob., 1951, p. 88], it can
be expressed as a multiple integral of essentially the product of $\phi_i(v)$, the derivatives of
$\Phi_i(v)$. Since, for many known cdf's $F_i(z)$, the functions $\phi_i(v)$ are of unknown form, the author
proposes the use of interpolating polynomials $Q_i(v)$ as approximations to $\Phi_i(v)$.
Consequently, approximate values of $P(R)$ can be obtained by elementary integrations when
$\phi_i(v)$ are replaced by $q_i(v)$, the derivatives of $Q_i(v)$. For the case $k = 1$, explicit formulae
for approximating $P(R)$ by this method are given. As illustrations, a few tables are
calculated for the normal alternatives. The same approximation procedure may be used to obtain
the large sample power of certain rank tests using large sample theory. The asymptotic
power efficiency of a class of rank tests is also investigated. (Received March 9, 1956.)


17. On Maximizing and Minimizing a Certain Integral with Statistical
Applications, Jagdish S. Rustagi, Carnegie Institute of Technology.


The problem considered in this paper is that of minimizing and maximizing
$\int_{-X}^{X} \phi(x, F(x))\,dx$ under the assumptions that $F(x)$ is a cumulative distribution function
(cdf) on $[-X, X]$ with the first two moments given and $\phi$ is a certain known function
having certain properties. The existence of the solution has been proved and a
characterization of the minimizing and maximizing cdf's given. The minimizing cdf is unique when
$\phi(x, y)$ is strictly convex in $y$ and is completely characterized for some special forms of $\phi$.
However, the maximizing cdf is a discrete distribution and in the above case turns out to
be a three-point distribution. Using a technique due to Karlin, the minimum problem has
been reduced to that of minimizing $\int_{-X}^{X} [\partial\phi(x, F_0(x))/\partial y + \eta_1 + \eta_2 x]\, F(x)\,dx$ over the class of
all cdf's on $[-X, X]$, where $F_0(x)$ is the minimizing cdf and $\eta_1$, $\eta_2$ are constants. This is a
problem linear in $F(x)$ and simpler to deal with. An interesting result proved is that


$$\{x : \partial\phi(x, F_0(x))/\partial y + \eta_1 + \eta_2 x \neq 0,\ -X < x < X\}$$


has $F_0$-measure zero. Results of Gumbel and of David and Hartley (Ann. Math. Stat., 1954)
have been obtained as special cases of the above problem and some other interesting
statistical applications are discussed. (Received March 9, 1956.)


18. Relations Between Stochastic Variables, Gerhard Tintner, Iowa State
College.


Information theory permits a unified treatment of various types of multivariate analysis 
(Kullback). It is possible to integrate weighted regression and the limited information 
method into this approach. A related method is the method of Theil. Other methods of ob- 
taining estimates of relations between stochastic variables are: Instrumental variables, 
minimum distance methods, nonparametric methods. The problem of identification
(uniqueness) is discussed and also difficulties arising because of autocorrelation and serial
correlation. (Received April 19, 1956.)


19. On the Distribution of the Likelihood Ratio, Robert V. Hogg, State
University of Iowa.


Consider the $k$ populations having probability density functions $Q(\theta_i)P(x)$, $a(\theta_i) \le x
\le b(\theta_i)$, zero elsewhere, $i = 1, 2, \dots, k$, where $P(x)$ is a real single-valued positive
continuous function of $x$ and either (1): $a(\theta)$ equals a constant and $b(\theta)$ is a strictly monotone
continuous function; or (2): $a(\theta)$ equals $\theta$ and $b(\theta)$ is a strictly monotone decreasing
continuous function. We test, by use of the likelihood ratio $\lambda$, the hypothesis that
$(\theta_1, \theta_2, \dots, \theta_k)$
belongs to a certain type of $m$-dimensional subset of the $k$-dimensional parameter space
against all possible alternatives. Under the null hypothesis it is proved that $-2 \ln \lambda$ has an
exact $\chi^2$ distribution with $2(k - m)$ degrees of freedom. (Received March 13, 1956.)
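
A familiar special case of form (1) is the uniform density on the interval from 0 to $\theta$, where the lower limit is the constant 0 and the upper limit is $\theta$ itself. The sketch below computes minus twice the log likelihood ratio for the hypothesis that all $k$ upper limits are equal. The choice of this density and this hypothesis (taken here to correspond to $m = 1$) is an assumption made for illustration, and the function name is hypothetical.

```python
import math
from typing import Sequence


def neg2_log_lambda_uniform(samples: Sequence[Sequence[float]]) -> float:
    """-2 ln(lambda) for testing equal upper limits of k uniform(0, theta_i) samples.

    The unrestricted maximum likelihood estimate of theta_i is the maximum
    of sample i; under the hypothesis the common estimate is the overall
    maximum. The likelihood ratio is the product over samples of
    (max_i / overall_max) ** n_i.
    """
    if any(len(s) == 0 for s in samples):
        raise ValueError("every sample must be non-empty")
    if any(x <= 0 for s in samples for x in s):
        raise ValueError("observations must be positive")
    maxima = [max(s) for s in samples]
    overall = max(maxima)
    return 2.0 * sum(len(s) * math.log(overall / m)
                     for s, m in zip(samples, maxima))


stat = neg2_log_lambda_uniform([[1.0, 2.0], [4.0]])
```

In this special case the statistic would be referred to the chi-square distribution with $2(k - 1)$ degrees of freedom, in line with the $2(k - m)$ of the abstract.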


20. Quadratic Time Homogeneous Birth and Death Processes, (Preliminary 
Report), Peter W. M. John, University of New Mexico, (By Title).


Time homogeneous birth and death processes, in which $\lambda_n = \lambda(n^2 + \alpha n)$ and
$\mu_n = \mu(n^2 + \alpha n)$,
where $\lambda$, $\mu$, $\alpha$ are constants with $\alpha \ge 0$, and $n(0) = 1$, are considered. A necessary and
sufficient condition that such a process shall be divergent, in the sense that there is a positive
probability of an infinite number of events occurring in finite time, is seen to be that $\lambda > \mu$.
The differential equations for the probability generating function and the cumulant
generating function for $n(t)$ are obtained. On equating coefficients the latter equation yields a
series of differential equations, which can be solved for the individual cumulants if, and only
if, the process is a balanced one, i.e., $\lambda = \mu$. In this case the moments of $n(t)$ and of the total
population count $M(t)$ are easily calculated. (Received March 21, 1956.)
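
Divergence is easiest to see in the pure-birth special case with death rate zero, which is a simplification made here for illustration and not a case singled out in the abstract. The holding time in state $n$ is then exponential with mean equal to the reciprocal of the birth rate, so the expected time to climb from state 1 through state $N$ is a partial sum of a convergent series. The function names below are hypothetical.

```python
def birth_rate(n: int, lam: float, alpha: float) -> float:
    """Quadratic birth rate lambda * (n^2 + alpha * n) in state n."""
    return lam * (n * n + alpha * n)


def expected_passage_time(lam: float, alpha: float, upto: int) -> float:
    """Expected time for a pure-birth process started at 1 to leave state `upto`.

    With no deaths the process visits states 1, 2, ..., upto in order, and
    the mean holding time in state n is 1 / birth_rate(n).
    """
    if lam <= 0 or alpha < 0 or upto < 1:
        raise ValueError("need lam > 0, alpha >= 0, upto >= 1")
    return sum(1.0 / birth_rate(n, lam, alpha) for n in range(1, upto + 1))


# With lam = 1 and alpha = 1 the terms are 1 / (n (n + 1)), which telescope.
t = expected_passage_time(1.0, 1.0, 999)
```

Because the partial sums stay bounded as the upper state grows, the expected time to pass through all states is finite, so infinitely many births occur in finite time; a linear birth rate would give a divergent harmonic-type sum instead.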


21. Seasonal Forecast of Some Time Series, (Preliminary Report), Joseph V.
Talacko, Marquette University.


Stationary time series with a predominant seasonal fluctuation (like reported cases of
infectious diseases, etc.), with observations $y_{11}, y_{12}, \dots, y_{1r}; y_{21}, \dots, y_{2r}; \dots, y_{ij}, \dots, y_{nr}$,
where $n$ is the number of years and $r$ the number of equidistant seasonal intervals,
permit a reasonable forecast of the most probable total of observations, with arbitrary
confidence intervals, early in the season. For sufficiently large $n$, the average cumulative
relative frequencies may be graduated and the graduation function $G(t)$ used as the
cumulative probability function. The seasonal random variations from $G(t)$ obey the Poisson
law as a function of the density $g(t)$, so the confidence intervals of the seasonal forecast
may be found for any $0 < t < 2\pi$ from a simple recurrence formula. Fourier or logit
regression may be applied with some advantages. An application to the morbidity data for
poliomyelitis in the United States, based on weekly and monthly observations since 1920,
is presented. (Received March 21, 1956.)


22. Confidence Intervals for Variance Ratios Specifying Genetic Heritability, 
FRANKLIN A. GRAYBILL, Oklahoma A and M College. 


Consider the twofold analysis of variance model with equal subclass numbers,
$y_{ijk} = \mu + a_i + b_{ij} + c_{ijk}$, where $y_{ijk}$ is the observation, $\mu$ is a fixed constant, and the $a_i$, $b_{ij}$,
and $c_{ijk}$ are independent normal variables whose means are zero and whose variances are
$\sigma_a^2$, $\sigma_b^2$, and $\sigma_c^2$ respectively.







In some fields of endeavor (especially in genetics) the above model occurs quite
frequently, and an estimate of the quantity $h^2 = 2(\sigma_a^2 + \sigma_b^2)/(\sigma_a^2 + \sigma_b^2 + \sigma_c^2)$ is often desired.
In this paper a method is presented for obtaining an approximate confidence interval for $h^2$
which is quite accurate even for very small sample sizes. The method is patterned somewhat
after the method used by Satterthwaite to obtain the approximate distribution of linear
functions of chi-square variates. (Received March 30, 1956.)


23. A Method of Multiple Rank Correlation, H. Theil, University of Chicago
and Netherlands School of Economics (introduced by D. L. Wallace).


Consider three rankings $x_{0i}$, $x_{1i}$, $x_{2i}$ ($i = 1, \dots, n$), all permutations of the first $n$
integers, the latter two being fixed. The null hypothesis is that the $x_0$-ranking is independent of
the other two, with equal probabilities for all $n!$ possibilities. For this purpose the following
coefficient of multiple rank correlation is proposed: $T_{0.12} = (\tau_{01} + \tau_{02})/(1 + \tau_{12})$, $\tau_{01}$ being
the Kendall rank correlation of $x_0$ and $x_1$, $\tau_{12}$ the (fixed) rank correlation of $x_1$ and $x_2$, etc.
$T_{0.12}$ is obtained by scoring according to Kendall for the two rankings $x_0$ and $x_1$, and for
the rankings $x_0$ and $x_2$, adding the scores, and dividing by the attainable maximum, given
the fixed $x_1$- and $x_2$-rankings.

We have $-1 \le T_{0.12} \le 1$ and, under the null hypothesis, $E\,T_{0.12} = 0$ and

$$\operatorname{var} T_{0.12} = \frac{4}{9}\binom{n}{2}^{-1} \left\{(n + 1)(1 + \rho_{12}) + \tfrac{3}{2}(1 + \tau_{12})\right\} / (1 + \tau_{12})^2,$$

$\rho_{12}$ being the Spearman rank correlation for $x_1$, $x_2$. The distribution of $T_{0.12}$ under the
null hypothesis is asymptotically normal.
The further analysis includes a discussion of the relation between $T_{0.12}$ and Kendall’s
coefficient of partial rank correlation, a discussion of Moran’s coefficient of multiple rank
correlation, and a generalisation for more variables. (Received March 30, 1956.)
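
The coefficient can be computed directly from its definition as the sum of two Kendall rank correlations divided by one plus the third. The sketch below assumes rankings without ties, as in the abstract; the function names are hypothetical.

```python
from itertools import combinations
from typing import Sequence


def kendall_tau(a: Sequence[int], b: Sequence[int]) -> float:
    """Kendall rank correlation of two rankings without ties."""
    n = len(a)
    if n != len(b) or n < 2:
        raise ValueError("rankings must have equal length of at least 2")
    score = 0
    for i, j in combinations(range(n), 2):
        product = (a[i] - a[j]) * (b[i] - b[j])
        if product == 0:
            raise ValueError("ties are not supported in this sketch")
        score += 1 if product > 0 else -1  # +1 concordant, -1 discordant
    return score / (n * (n - 1) / 2)


def multiple_rank_correlation(x0: Sequence[int], x1: Sequence[int],
                              x2: Sequence[int]) -> float:
    """(tau_01 + tau_02) / (1 + tau_12) for three rankings of the same items."""
    t12 = kendall_tau(x1, x2)
    if t12 == -1.0:
        raise ValueError("coefficient undefined when x1 and x2 are exactly reversed")
    return (kendall_tau(x0, x1) + kendall_tau(x0, x2)) / (1.0 + t12)


# x0 agrees with x1; x2 differs from both by one adjacent swap.
value = multiple_rank_correlation([1, 2, 3], [1, 2, 3], [2, 1, 3])
```

The denominator plays the role of the attainable maximum of the summed scores: when the two fixed rankings disagree, no third ranking can be perfectly concordant with both, and dividing by one plus their correlation rescales the sum accordingly.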


24. Moments of a Test Criterion for Outliers, P. A. Knys, Iowa State College.


We assume a random sample, $x_1, \dots, x_n$, of size $n$ from $N(0, 1)$ arranged in ascending
order of magnitude so that $x_1$ is the smallest and $x_n$ the largest observation. It is sometimes
of interest to test whether $x_n$ is in fact the largest observation in a random sample or whether
it ought to be discarded as an outlier. Thompson, Ann. Math. Stat., Vol. 6 (1935), pp. 214-219,
and Pearson and Chandra Sekar, Biometrika, Vol. 28 (1936), pp. 308-320, have considered
as a test criterion the ratio $u = (x_n - \bar x)/s$ where $s^2$ is the sample mean square,
$s^2 = \sum (x_i - \bar x)^2/(n - 1)$. By a special argument Pearson and Chandra Sekar were able to derive
the extreme upper percentage points of $u$. Formulas for the moments of $u$ have been derived
in this paper. Since $u$ is distributed independently of $s$, the moments of $u$ can be derived
from those of $(x_n - \bar x)$ and $s$. A specially adapted characteristic function technique,
previously used by McKay, Biometrika, Vol. 27 (1935), pp. 466-471, was employed to obtain
formulas for the moments of $(x_n - \bar x)$. McKay’s technique links the moments of $(x_n - \bar x)$
to those of $x_n$. The latter have recently been tabulated by Ruben, Biometrika, Vol. 41 (1954),
pp. 200-227. The derivation and tabulation of these moments of the extreme deviate permits
an approximation to the distribution by a Pearson type fit to the first four exact moments.
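
For a given sample the criterion is a one-line computation: the largest observation minus the sample mean, divided by the square root of the sample mean square with divisor $n - 1$. A minimal sketch follows; the function name is hypothetical.

```python
import math
from typing import Sequence


def extreme_deviate_ratio(sample: Sequence[float]) -> float:
    """u = (largest observation - sample mean) / s, with s^2 using divisor n - 1."""
    n = len(sample)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = sum(sample) / n
    s2 = sum((x - mean) ** 2 for x in sample) / (n - 1)
    if s2 == 0:
        raise ValueError("all observations are equal, so s is zero")
    return (max(sample) - mean) / math.sqrt(s2)


u = extreme_deviate_ratio([1.0, 2.0, 3.0])
```

The ratio is unchanged by shifting or rescaling the data, which is why its null distribution does not depend on the mean or variance of the normal population sampled.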


25. Post Stratification in Multistage Sampling, William H. Williams, Iowa
State College.


Let ${}_iM$ denote the number of units of a population falling into the $i$th stratum of a
number ($L$) of strata ($i = 1, 2, \dots, L$) and ${}_iY$ denote the total of the characteristics ${}_iy_t$ ($t = 1,
2, \dots, {}_iM$). Their mean is ${}_i\bar Y = {}_iY/{}_iM$.

In certain situations one cannot determine the exact stratum to which a unit belongs
until after it has been sampled. In these situations a device known as post stratification is
sometimes used, consisting of the following steps: (a) Draw a random sample of size $m$. (b)
Denote by ${}_i\bar y$ the mean of the ${}_im$ units which happen to be sampled in the $i$th stratum. (c)
Estimate the population total by $\hat Y = \sum_{i=1}^{L} {}_iM\,{}_i\bar y'$ where ${}_i\bar y' = {}_i\bar y$ if ${}_im > 0$ and ${}_i\bar y' = 0$ if
${}_im = 0$. The approximate variance formula for this estimator can be found in the literature.
Here a new approach is used to derive these formulas. This method, which uses the ratio
estimator ${}_i\hat{\bar Y}$ of the stratum mean ${}_i\bar Y$, can be generalized to cover post stratification in any
sampling design however complex. The result is as follows: Let $\hat Y({}_iy_t)$ denote the estimate
of the total for the particular sampling scheme which has been used and $V({}_iy_t)$ denote its
variance. Then the estimate of the population total is $\sum_{i=1}^{L} {}_iM\,{}_i\hat{\bar Y}$ and its variance is $V({}_iy_t')$
where ${}_iy_t' = {}_iy_t - {}_i\bar Y$ if ${}_iy_t$ is in the $i$th stratum.

This approach permits consideration of different systems of post stratification with
regard to precision. (Received April 20, 1956.)
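
Steps (a) to (c) for a simple random sample can be sketched as follows. The data layout (a mapping from stratum label to stratum size, and a list of stratum-value pairs) and the function name are hypothetical choices made for illustration.

```python
from typing import Dict, Hashable, Iterable, Tuple


def post_stratified_total(stratum_sizes: Dict[Hashable, int],
                          sample: Iterable[Tuple[Hashable, float]]) -> float:
    """Estimate a population total by post stratification.

    stratum_sizes maps each stratum to its known number of units.
    sample holds (stratum, value) pairs; the stratum of a unit is only
    learned once the unit has been sampled. A stratum with no sampled
    units contributes zero, as in step (c).
    """
    sums: Dict[Hashable, float] = {}
    counts: Dict[Hashable, int] = {}
    for stratum, value in sample:
        if stratum not in stratum_sizes:
            raise ValueError("sampled unit belongs to an unknown stratum")
        sums[stratum] = sums.get(stratum, 0.0) + value
        counts[stratum] = counts.get(stratum, 0) + 1
    total = 0.0
    for stratum, size in stratum_sizes.items():
        m = counts.get(stratum, 0)
        mean = sums[stratum] / m if m > 0 else 0.0
        total += size * mean
    return total


# Hypothetical data: stratum "c" happens to receive no sampled units.
estimate = post_stratified_total({"a": 10, "b": 20, "c": 5},
                                 [("a", 2.0), ("a", 4.0), ("b", 1.0)])
```

The zero contribution of an empty stratum is what makes the estimator slightly biased in small samples, which is one reason the variance formulas are described as approximate.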


NEWS AND NOTICES


Readers are invited to submit to the Secretary of the Institute news items of interest.

Personal Items

The Degree of Doctor of Laws was conferred on Harold Hotelling by the
University of Chicago on 11 November, 1955, at the convocation in celebration
of the twenty-fifth anniversary of the University’s Social Science Research
Building. Dr. Hotelling was cited as the “foremost contemporary contributor
of quantitative methods to the social sciences, who by mathematical
analysis has notably advanced our understanding of fundamental problems in
economics and in statistics.”


George E. Auman has been appointed Assistant Chief of the National Bureau
of Standards Management Planning Division, where he will assist in the Bureau’s
management program.

Paul M. Blunk has left Operations Research Group, Convair, Fort Worth, 
Texas, to become Chief, Reliability Group, Aerojet General Corp., Sacramento, 
Calif. He will apply principles of probability and mathematical statistics to 
the field of rocket reliability. 

Derrill J. Bordelon received an M.A. in Mathematics from the University of 
Maryland, February 1, 1956. 

William Fuller Brown, Jr., formerly of the Physical Laboratory of the Sun 
Oil Company, Newtown Square, Pa., has accepted a position as a Senior Phys- 
icist in the Central Research Department of Minnesota Mining and Manu- 
facturing Company, St. Paul, Minnesota. 

Jack Chassan has accepted the position of Chief Statistician at Saint Eliza- 
beth’s Hospital in Washington, D. C. 

Dr. Willard H. Clatworthy has accepted a position as Staff Statistician with 
Westinghouse Electric Corporation, Atomic Power Division, Bettis Plant, 
Pittsburgh 30, Pa. 

Mr. John L. Dalke has been appointed chief of the High Frequency Impedance
Standards Section, Radio Standards Division, Boulder Laboratories, National
Bureau of Standards.

Miss Lila Elveback has been appointed Associate Professor of Biostatistics 
in the Dept. of Tropical Medicine and Public Health at the School of Medicine, 
Tulane University in New Orleans, Louisiana. 

The Royal Society has announced the award of the Copley Medal to Sir 
Ronald Fisher, F.R.S., Arthur Balfour Professor of Genetics, Cambridge Uni- 
versity, for contributions to developing the theory and application of statistics 
for making quantitative a vast field of biology. 

David W. Gaylor has accepted a position with the Nuclear Research and 
Development Department, Convair Aircraft Co., Ft. Worth, Texas. 

Walter M. Gilbert is now Assistant Professor of Mathematics, Iowa State 
College, Ames, Iowa. 

Mr. Robert Gran, formerly of Evanston, Illinois, recently moved to Palatine, 
Illinois. He is currently employed as statistician at the Cook County Highway 
Department in Chicago. 

Georges Th. Guilbaud, formerly Deputy Director of I.S.E.A., has been appointed
as Directeur d’Études à l’École pratique des Hautes Études (Paris,
Sorbonne), méthodes mathématiques dans les sciences sociales.

Clifford Hildreth resigned his position as Professor of Agricultural Economics 
at North Carolina State College in August, 1955, to accept a similar job at 
Michigan State University. 

W. Robert Hydeman, formerly of the Electronic Computer Department of
Remington Rand, has joined The Ramo-Wooldridge Corporation as a member of
the Technical Staff. Mr. Hydeman is situated in the Cleveland office and is
associated with the Management Science Group of the Computer System
Division.

Dr. H. S. Konijn has resigned his Lectureship at the University of California
in Berkeley to become Lecturer in Economic Statistics in the University of 
Sydney, Australia. 

Roy R. Kuebler, Jr., formerly Associate Professor of Mathematics at Dickin- 
son College, has been in the Office of the Chief of Ordnance, Department of the 
Army, Washington 25, since July. He holds a position of Mathematician in 
the Design of Experiment Unit, Research and Development Division.

Robert J. Kurland has begun postdoctoral research in microwave spec- 
troscopy at the National Bureau of Standards. He is one of seven young scien- 
tists to be selected for the Postdoctoral Research Associateship program 
sponsored by the National Academy of Sciences-National Research Council 
and the Bureau. 

Andre G. Laurent, formerly Head of Prices and Quantitative Economic 
Analysis Sections, Department of Economics, National Institute of Statistics 
and Economics, Paris, France, and Professor at the S.Q.C. Center of the Institute
of Statistics of the University of Paris, after spending the academic
year 1953-1954 as a Post-Doctoral Fellow at the Committee on Statistics of
the University of Chicago, has joined the staff of the Department of Statistics,
Michigan State University.

Duncan C. McCune, formerly quality control statistician for the Tubular 
Products Div. of The Babcock and Wilcox Co., has accepted the position of 
Staff Statistician with Jones and Laughlin Steel Corporation in Pittsburgh, Pa. 

Ralph A. Maggio, formerly Lecturer at Rutgers University, New Brunswick, 
New Jersey, has accepted the position of Quality Control Engineer, Red Bank 
Division, Bendix Aviation Corporation, Eatontown, New Jersey. 

Miss Margaret Pearl Martin is on leave from her position at Vanderbilt 
University and is spending the academic year at the University of Chicago, 
Committee on Statistics. 

John Winston Mayne, who for the last two years has been Senior Operations 
Research Officer of the Joint Services Operational Research Team at Tactical 
Air Command Headquarters in Edmonton, Alberta, has recently been ap- 
pointed Director of Operational Research, Navy, at National Defense Head- 
quarters, Ottawa. 

Harold Glazer has recently been appointed Senior Engineer in the Analysis 
Department of the Waltham Laboratories, Sylvania Electric Products, Inc. 
His assignment covers the application of methods of mathematical statistics 
to the design of programs of machine calculations. 

Mrs. Mary G. Natrella has rejoined the Applied Mathematics Division of 
the National Bureau of Standards where she will serve on the staff of the Sta- 
tistical Engineering Laboratory. 

Ingram Olkin, Associate Professor of Statistics, Michigan State University, 
is on leave during 1955-1956, and has been appointed to a visiting associate 
professorship in the Committee on Statistics, University of Chicago for this 
period. 

Dr. K. C. Sreedharan Pillai has been deputed for a year by the United Na- 
tions Technical Assistance Administration to the Philippines. 

John W. Pratt has completed his Ph.D. in Statistics at Stanford University 
and is now a Research Associate on a Navy Contract at the University of 
Chicago. 

Bruce P. Price, formerly of Southwest Research Institute, San Antonio,
Texas, is now statistician in an Operations Research Group at the Phillips
Petroleum Research and Development, Bartlesville, Oklahoma.

Professor Henry Quastler has left the University of Illinois and accepted a 
position as Senior Radiobiologist at the Brookhaven National Laboratory. 

Dr. Bernard J. Ransil has begun postdoctoral research at the National Bureau 
of Standards. He is one of seven young graduate scientists to be selected for 
the Postdoctoral Research Associateship program sponsored by the National 
Academy of Sciences-National Research Council and the Bureau. Dr. Ransil is 
interested in the quantum statistical treatment of isotope effects. 

Mrs. Joan Raup Rosenblatt, a mathematical statistician, has joined the 
Statistical Engineering Section of the National Bureau of Standards’ Applied 
Mathematics Division. She will be primarily concerned with the theory and
application of nonparametric techniques of statistical inference. 

Murray Rosenblatt has been appointed to an Associate Professorship in 
Mathematics at Indiana University, Bloomington, Indiana. 

Melvin E. Salveson has transferred from Manager-Business Research, General 
Electric Co., Louisville, Kentucky, to Consultant-Operations Research, Gen- 
eral Electric, New York City. 

George W. Snedecor is now on a five-month assignment in Campinas, State 
of Sao Paulo, Brazil, supported in part by the Rockefeller Foundation. He will 
return to Iowa State College in June. His objectives are to consult with re- 
search workers on the design and analysis of agricultural experiments, mainly 
at the Institute of Agronomy in Campinas, and to confer on the organization 
of a Center for teaching and research in experimental statistics. The latter is a 
joint project of the Ministries of Education and Agriculture, State of Sao Paulo, 
with the Institute of Agronomy, the University of Sao Paulo and the College 
of Agriculture at Piracicaba cooperating.

Major General Leslie E. Simon, for the past six years Assistant Chief of 
Army Ordnance for Research and Development, has retired from the Army 
and is now Director of Research and Development, The Carborundum Com- 
pany, Niagara Falls, New York; Member of the Board of Directors, the Gruen 
Precision Laboratories, Cincinnati, Ohio; and Member of the Board of Trustees, 
the Illinois Institute of Technology, Chicago, Illinois. 

Arthur Stein, a U. S. Army Ordnance specialist in ballistic quality control 
for ten years, has joined Cornell Aeronautical Laboratory, Inc. as principal 
research engineer in the Systems Research Department. 

Dr. Balkrishna V. Sukhatme has obtained his Ph.D. degree in Mathematical 
Statistics, University of California, Berkeley in June, 1955, and is presently 
employed as Senior Research Statistician by the Indian Council of Agricultural 
Research, New Delhi. 

Dr. Daniel Teichroew is now with the National Cash Register Company, 
Dayton, Ohio, as Senior Electronic Applications Specialist in the Electronic 
Sales Department. 

Dr. John G. Thompson, Chief of the Metallurgy Division of the National 
Bureau of Standards, has retired after more than 35 years of government service. 

Ronald E. Walpole is now doing graduate work at the Virginia Polytechnic 
Institute, Blacksburg, Virginia. He was formerly with the Defense Research 
Board in Ottawa, Canada. 




SOCIOMETRY 


The American Sociological Society, publisher of the American Sociological 
Review, announces the publication of Sociometry: A Journal of Research in Social
Psychology. Founded in 1937 by Dr. J. L. Moreno, this quarterly journal will 
become an official publication of the Society with the March 1956 issue. 

Summer Offerings in Statistics at Iowa State College


The 1956 summer offerings of the Department of Statistics at Iowa State 
College will be given in two sessions: June 11—July 18 and July 18—August 24. 
Students may register for either six-week session or for the full 12-week sum- 
mer quarter. The offerings include six courses at the advanced undergraduate 
and graduate minor level and are designed to satisfy background requirements 
for graduate major work in statistics. Senior members of the staff will be avail- 
able during most of the summer for consultation on graduate research and 
special problems courses. The complete program consists of Stat. 401 and 402, 
“Statistical Methods for Research Workers,” offered in sequence; Stat. 447 
and 448, “Statistical Theory for Research Workers,” offered in sequence; Stat. 
411, “Experimental Designs for Research Workers,” and Stat. 421, “Survey 
Designs for Research Workers,” both given in the second session; Stat. 599,
“Special Problems,” and Stat. 699, “Research,” both elective either session.
Additional information may be obtained from T. A. Bancroft, Director, Statis- 
tical Laboratory, Iowa State College. 

Civil Service Examinations 


There is an urgent need for Chemists, Mathematicians, Metallurgists, Phys- 
icists, and Electronic Scientists in the Washington, D. C., area, the United 
States Civil Service Commission has announced. Vacancies are in various 
Federal agencies and pay salaries ranging from $4,345 to $11,610 a year. Further 
information and application forms may be obtained at many post offices 
throughout the country, or by writing to the U. S. Civil Service Commission,
Washington 25, D. C. Applicants should ask for Announcement No. 46(B).
Applications will be accepted by the Board of U. S. Civil Service Examiners,
National Bureau of Standards, Washington 25, D. C., until further notice. 




Doctoral Dissertations in Statistics, 1955 


Listed below are the doctorates conferred during the year 1955 in the United
States and Canada for which the dissertations were written on topics in statistics
or related fields. The university, major subject, and the title of the dissertation
are given in each case. Readers are invited to notify the Editor of any
omissions from this list.

Gordon C. Ashton, North Carolina State College, major in statistics and 
animal industry, “The Utility of Certain Concomitant Observations in Reduc- 
ing the Variation in the Rate of Gain in Body Weight of Swine.” 

J. Y. Barry, Yale, “Generation of Abstract Markov Processes.” 

Patrick Paul Billingsley, Princeton, major in probability, “The Invariance 
Principle for Dependent Random Variables.” 

Charles Boll, Stanford, major in statistics, “Comparison of Experiments in 
the Infinite Case and the Use of Invariance in Establishing Sufficiency.” 
Helen Bozivich, Iowa State College, major in statistics, “Power of Test
Procedures for Certain Incompletely Specified Random and Mixed Models.”

Donald L. Brakensiek, Iowa State College, major in statistics and agricultural
engineering, “Estimation of Surface Run-off Volumes from an Agricultural
Watershed by Infiltration Theory.”

Samuel H. Brooks, Johns Hopkins, major in biostatistics, “Comparison of
Methods for Estimating the Optimal Factor Combination.”

Edward Clark Bryant, Iowa State College, major in statistics, “An Analysis
of Some Two-way Stratifications.”

Donald L. Burkholder, North Carolina, major in statistics, “On a Certain 
Class of Stochastic Approximation Processes.” 

Raymond Collier, Minnesota, major in statistics, “Experimental Designs in 
Which the Observations are Assumed to be Correlated.” 

John W. Coy, Michigan, major in mathematical statistics, “A Differential
Calculus in a Matrix Algebra.”

L. E. Dubins, Chicago, “Generalized Random Variables.”

Frances E. Dunn, Harvard, major in education, “Guiding College Students
in the Selection of a Field of Concentration.”

Lillian Elveback, Minnesota, major in statistics, “Some Aspects of Estima- 
tion Problems in Follow-Up Studies in Chronic Disease.” 

Jay Ernest Folkert, Michigan State University, major in mathematical 
statistics, “The Distribution of the Number of Components of a Random 
Mapping Function.” 

Rudolf J. Freund, North Carolina State College, major in statistics,
“Introduction of Risk into a Linear Programming Model.”

Charles E. Gates, North Carolina State College, major in statistics, “The
Constitution of Genetic Variances and Covariances in Self-Fertilized Crops
Assuming Linkage.”

Seymour Geisser, North Carolina, major in statistics, “On the Exact Dis- 
tribution of Certain Statistics Related to Mean Square Successive Difference.” 

William J. Hall, North Carolina, major in statistics, “Most Economical
Multiple-Decision Rules.”

Irwin Guttman, Toronto, “Characterization of Tolerance Regions.”

John Hamblen, Purdue, major in mathematical statistics, “Distribution of
Roots of Algebraic Equations with Variable Coefficients.”

William C. Healy, Jr., Illinois, major in statistics, “Optimum Invariant
Estimation.”

John Frederick Hofmann, Iowa State College, major in statistics, “Life 
Testing in Controlled Environmental Conditions.” 

Robert Wakely Kennard, Carnegie Institute of Technology, major in mathematics,
“On the Determination of Significance Levels of Control Charts for
Continuous Variates.”

Ernest G. Kimme, Minnesota, major in mathematics, “On the Convergence
of Sequences of Stochastic Processes with Independent Increments.”

William H. Kruskal, Columbia, major in mathematical statistics, “On the
Problem of Non-Normality in Relation to Hypothesis Testing.”

J. J. J. Lavallee, McGill, “Asymptotic Theorems in the Theory of Normal
Correlation.”

Fred W. Lott, Jr., Michigan, major in mathematical statistics, “The Use of a
Certain Linear Order Statistic Related to the Mean Difference, as an Unbiased
Estimate of the Standard Deviation in Finite and Infinite Population.”

Franklin S. McFeely, Virginia Polytechnic Institute, major in statistics, 
“Decision Procedures for the Comparison of Exponential and Geometric Popu- 
lations.” 

Henry Pratt McKean, Jr., Princeton, major in probability, “Sample Func- 
tions of Stable Processes.” 

Joseph Navarro, Purdue, major in mathematical statistics, “The Theory of 
Transect Sampling.” 

Joseph Edward Nelson, Chicago, major in mathematics, “On the Operator 
Theory of Markoff Processes.” 

Wesley L. Nicholson, Illinois, major in statistics, “Uniform Admissibility 
and Its Application to Classification Problems.” 

John Pratt, Stanford, major in statistics, “Some Results in the Decision 
Theory of One-Parameter Multivariate Polya Type Distributions.” 

K. V. Ramachandran, North Carolina, major in statistics, “On Certain Tests
and the Monotonicity of Their Powers.”

Paul Randolph, Minnesota, major in statistics, “Multivariate Acceptance
Sampling.”

Bayard Rankin, California (Berkeley), major in statistics, “The Concept of
Sets Enchained by a Stochastic Process and its Use in Cascade Shower Theory.”

Elmer Remmenga, Purdue, major in statistics, “The Nature and Magnitude
of Experimental Errors in Grazing Trials.”

George Resnikoff, Stanford, major in statistics, “Contributions to Sampling 
Inspection by Variables.” 

Douglas Sherman Robson, Cornell, major in statistics, “Admissible and Mini- 
max Integer-valued Estimators.” 

Albert Rohloff, Purdue, major in mathematical statistics, “Sequential Tests 
of Composite Hypotheses.” 

Elmahdy E. Said, North Carolina State College, major in statistics, “A
Comparison between Alternate Techniques Using Supplementary Information
in Sample Survey Design.”

Norman Carmen Severo, Carnegie Institute of Technology, major in mathematics,
“A Comparison of Tests on the Mean of a Logarithmico-Normal Distribution
with Known Variance.”

D. A. Sprott, Toronto, “Balanced Incomplete Block Designs.”

Robert L. Stearman, Johns Hopkins, major in biostatistics, “A Statistical
Estimation of Variation Encountered in Studying Dispersion of Radioactive
Phosphorus in Embryonated Chicken Eggs.”

George Powell Steck, California (Berkeley), major in statistics, “Limit
Theorems for Conditional Distributions.”

Balkrishna V. Sukhatme, California (Berkeley), major in statistics, “Testing
the Hypothesis That Two Populations Differ Only in Location.”

B. D. Tikkiwal, North Carolina State College, major in statistics,
“Multiphase Sampling on Successive Occasions.”

John Tischendorf, Purdue, major in mathematical statistics, “Linear
Estimation Techniques Using Order Statistics.”

Donald Truax, Stanford, major in statistics, “Multi-Decision Problems for
the Multivariate Exponential Family.”

Howard Gregory Tucker, California (Berkeley), major in mathematics,
“Contributions to the Mathematical Theory of Accident Proneness with
Particular Reference to Multiple Exposure.”

Elizabeth Vaughan, Stanford, major in statistics, “Estimation of a Bio- 
logical Population which is Subject to a Biased Mortality.” 

Eleanor Weiss, Harvard, major in education, “Factor Analysis of Mathematical
Ability.”

Irving Weiss, Stanford, major in statistics, “Limiting Distributions in Some 
Occupancy Problems.” 

Oscar Wesler, Stanford, major in statistics, “A Modified Minimax Prin- 
ciple.” 

John S. White, Minnesota, major in statistics, “Some Problems of Point
Estimation in Stochastic Difference Equations.”

Martin Bradbury Wilk, Iowa State College, major in statistics, “Linear
Models and Randomized Experiments.”

R. Lowell Wine, Virginia Polytechnic Institute, major in statistics, “A Power 
Study of Multiple Range and Multiple F Tests.” 

Ralph Wormleighton, Princeton, major in statistics, “Some Extensions of
the Sign Test.”


New Members 
The following persons have been elected to membership in the Institute 
November 10, 1955 to February 10, 1956 


Adams, John W., B.S. (Univ. of Nebraska), Associate Engineer Process Development, 
Westinghouse Atomic Power Division, Westinghouse Electric Corporation, Bettis 
Plant, P. O. Box 1468, Pittsburgh 30, Pennsylvania.

Blachman, Nelson M., Ph.D. (Harvard), Specialist, Communication Theory and Computers,
Electronic Defense Laboratory, Sylvania Electric Products, Inc., Box 205,
Mountain View, California.

Bradley, James V., M.A. (Univ. of Virginia), Research Psychologist, Controls Section, 
Psychology Branch, Aeromedical Laboratory WCRDP-1, Wright Air Development 
Center, Wright-Patterson AFB, Ohio, Apt. 7, 215 Dayton St., Yellow Springs, Ohio 

Choi, Yun Shick, Ph.D. (Seoul National Univ.), Head of Dept. of Mathematics, Seoul 
National University, Seoul, Korea, 6127 S. Greenwood Ave., Chicago 37, Illinois (until 
May 1956 and thereafter Seoul National University). 

Feinstein, Anita J., M.S. (Univ. of Miami), Statistician Research Aide, The Marine Laboratory
of the University of Miami, Coral Gables, Florida, 489 Anastasia Avenue, Coral
Gables, Florida.

Frishman, Fred, B.B.A. (College of the City of New York), Statistician, Research and 
Development Department, Naval Powder Factory, Indian Head, Maryland. 

John, P. W. M., Ph.D. (Univ. of Oklahoma), Asst. Prof. of Math., University of New 
Mexico, Albuquerque, New Mexico. 

Mandelbrot, Benoit, B.S. (University of Paris), Visiting Ass’t. Professor, Charge de Cours 
a la Faculte des Sciences, University of Geneva, Geneva, Switzerland. 

Mathis, Harold F., Ph.D. (Northwestern), Research Engineer, Dept. 453C, Goodyear 
Aircraft Corporation, Akron 15, Ohio. 

Mattson, Thomas B., B.S. (Carnegie Institute of Technology), Graduate Student and Research
Assistant, Carnegie Institute of Technology, Schenley Park, Pittsburgh 13,
Pa., 1694 Dormont Ave., Pittsburgh 16, Pa.

McLemore, B. H., M.A. (University of Illinois), Student, Stanford University, Stanford, 
California, Sequoia Hall, Stanford University, Stanford, California. 

Nadler, Jack, A.M. (Univ. of Chicago), Student, University of Chicago, Chicago, Illinois, 
1368 East 57th St., Chicago 37, Illinois. 

Narayana, T. V., Ph.D. (Univ. of N. C.), Lecturer, Department of Mathematics, McGill 
University, Montreal, Canada. 

Neff, John D., M.S. (Kansas State College), Student, University of Florida, Gainesville, 
Florida, 1105 N. W. Third Avenue, Gainesville, Florida. 

Pasternack, Bernard S., B.A. (Brooklyn College), Graduate Assistant, Dept. of Experimental
Statistics, N. C. State College, Raleigh, North Carolina, Box 5457, State College
Station, Raleigh, North Carolina.

Pitt, Hyman, M.S. (University of Wisconsin), Statistician, Instructor in Math., Univer- 
sity of Wisconsin, Madison, Wisconsin, 37 Merlham Drive, Madison 5, Wisconsin. 

Riffenburgh, R. H., M.S. (College of William and Mary), Instructor in Math., Research 
Fellow, Virginia Polytechnic Institute, Blacksburg, Virginia, 1005 Draper Road, Blacks- 
burg, Virginia. 

Saunders, David R., Ph.D. (Univ. of Ill.), Research Associate, Educational Testing Serv- 
ice, 20 Nassau Street, Princeton, New Jersey. 

Sawrey, William L., Ph.D. (Univ. of Nebraska), Asst. Prof., Department of Psychiatry, 
University of Colorado School of Medicine, 4200 E. 9th Avenue, Denver, Colorado. 

Sherman, Bernard, Ph.D. (Princeton), Assistant Research Mathematician, Numerical 
Analysis Research, University of California, Los Angeles 24, Calif. 

Shuford, D. B., A.B. (Univ. of Ill), Teaching Assistant, Department of Economics, Uni- 
versity of Illinois, Urbana, Illinois. 

Slimak, Romuald, B.Sc. (University of London), Statistician, Electronic Computer Department,
Univac Division, Sperry-Rand Corporation, 315 4th Ave., New York, N. Y.

Tikkiwal, B. D., Ph.D. (N. C. State Univ.), Assistant Statistician, Institute of Statistics, 
N. C. State Univ., Raleigh, North Carolina. 

Titchen, Robert S., D.S. (University of Paris), Operations Analyst, Operations Evaluation 

REPORT OF THE PRINCETON MEETING OF THE INSTITUTE OF
MATHEMATICAL STATISTICS


The sixty-ninth meeting of the Institute of Mathematical Statistics, an 
Eastern Regional meeting, was held at Princeton University on April 20-21, 
1956. All sessions were joint with the Biometric Society (ENAR). An invited 
address on Some Problems Connected with Statistical Inference was given by 
D. R. Cox of the Universities of Cambridge and North Carolina.

The following members of the Institute attended: 


Frank Akutowicz, W. R. Allen, T. W. Anderson, Beverly E. Arens, Harvey J. Arnold,
Allan Birnbaum, C. I. Bliss, Albert H. Bowker, R. A. Bradley, Bradley Bucher, Mrs. M.
B. Carroll, Jerome Cornfield, D. R. Cox, E. L. Cox, C. Daniel, Besse B. Day, A. P. Dempster,
C. Derman, C. W. Dunnett, A. R. Eckler, C. Eisenhart, W. Feller, S. M. Free, D. P.
Gaver, Seymour Geisser, D. M. Gilford, M. H. Gourary, B. G. Greenberg, S. W. Greenhouse,
Max Halperin, W. C. Healy, Jr., Gerald C. Helmstodter, Robert Hooke, J. S. Hunter,
J. Edward Jackson, M. V. Johns, Jr., G. H. Kennedy, A. W. Kimball, Clyde Kramer, T.
E. Kurtz, F. Loro, E. Lukacs, M. D. Lum, Paul Meier, Sutton Monro, M. Morrison, J.
Moshman, R. B. Murphy, L. F. Nanni, W. L. Nicholson, L. M. Noel, G. E. Noether, M. L.
Norden, Emanuel Parzen, Henry Polowy, Mrs. L. K. Randolph, Bayard Rankin, W.
Richardson, W. L. Roach, Jr., Harry M. Rosenblatt, J. R. Rosenblatt, D. E. Sands, A. E.
Sarhan, F. E. Satterthwaite, R. Sitgreaves, W. L. Smith, M. Sobel, Herbert Solomon,
P. N. Somerville, Frederick F. Stephan, H. C. Sweeny, Z. Szatrowski, F. B. Taylor, J. G. C.
Templeton, Milton E. Terry, L. J. Tick, L. R. Tucker, J. W. Tukey, M. C. K. Tweedie,
D. L. Wallace, Irving Weiss, Frank Wilcoxon, M. B. Wilk, R. Lowell Wine, W. J. Youden,
Marvin Zelen.


The program follows: 
FRIDAY, APRIL 20, 1956 

8:30 a.m. Confidence Techniques in Multiple Regression 

Chairman: Martin B. Wilk, Princeton University.

1. Understanding Regression Analysis (Use of Characteristic Roots), M. A. Efroymson,
Esso Research and Engineering Company.

2. Some Approximate Confidence Procedures, David L. Wallace, University of Chicago.

Discussant: John W. Tukey, Princeton University.

10:00 a.m. Paired Comparisons 

Chairman: Milton E. Terry, Bell Telephone Laboratories.

1. The Discriminal Dispersion in Thurstone’s Paired Comparisons Model, Robert P.
Abelson, Yale University.

2. Methods for the Rank Analysis of Paired Comparisons, Ralph A. Bradley, Virginia
Polytechnic Institute.

3. Use of Scheffé’s Analysis of Variance for Paired Comparisons in Factorial Experiments,
Otto Dykstra, Jr., General Foods Corporation.

4. An Evaluation of Some Statistical Techniques Used in the Analysis of Paired Comparisons
Data, J. Edward Jackson, Eastman Kodak Company.

2:00 p.m. Statistical Problems in Poliomyelitis

Chairman: S. M. Free, Smith, Kline and French Laboratories.

1. A Rapid Approximate Statistical Procedure for Estimating Potency in Experimental
Poliomyelitis Vaccine, Joseph L. Ciminera, Sharp and Dohme.

2. Quality Control Aspects of Safety Testing, Paul Meier, Johns Hopkins University.

3. Observations on the Effectiveness of the Poliomyelitis Vaccine in North Carolina in 1955,
Bernard G. Greenberg, University of North Carolina.

Discussants: Max Halperin, National Heart Institute, and W. J. Hall, U. S. Public
Health Service.


4:00 p.m. Contributed Papers I

Chairman: Gottfried E. Noether, Boston University.

1. On Certain Stabilities of Sample Survey Response, (Preliminary Report), David
Rosenblatt, American University.

2. Some Multi-Level Continuous Sampling Plans, C. Derman, S. Littauer, and H. Solomon,
Columbia University.

3. Applications of Vector-Valued Risk Functionals, (Preliminary Report), Allan
Birnbaum, Columbia University.

4. On the Use of Randomization in the Investigation of an Unknown Function, Robert
Hooke, Westinghouse Electric Corporation.

5. An Extension of a Method of Making Multiple Comparisons, Thomas E. Kurtz, Princeton
University.

6. On the Power of Some Rank Order Two-sample Tests, Joan Raup Rosenblatt, National
Bureau of Standards.


8:00 p.m. Invited Address 
Chairman: B. GREENBERG, University of North Carolina. 


Some Problems Connected with Statistical Inference, D. R. Cox, University of Cambridge 
and University of North Carolina. 


SATURDAY, APRIL 21, 1956 


8:30 a.m. Analysis of Variance and Experimental Design 


Chairman: Mavis Carroll, General Foods Corporation.

1. Use of a Concomitant Variable in the Selection of An Experimental Design, D. R. Cox,
University of North Carolina.

2. The Combination of Tests of Significance for Incomplete Block Design, Marvin Zelen,
National Bureau of Standards.

3. Estimation of Individual Variations in an Unreplicated Two-Way Classification, Thomas
Russell, Virginia Polytechnic Institute.


10:30 a.m. Some Problems in Estimation 

Chairman: W. H. Ott, Merck Institute.

1. Some Statistical Properties of Inverse Gaussian Distributions, M. C. K. Tweedie, Virginia
Polytechnic Institute.

2. Composite Estimation in Repetitive Surveys, Joseph Steinberg, Bureau of the Census.

3. Maximum Likelihood Estimation of a Mean in the Case of Unequal Variances, Nathaniel
Mantel, National Cancer Institute.


2:00 p.m. Contributed Papers II 
Chairman: A. Birnbaum, Columbia University.

1. The Comparative Importance of the Order of an Observation in Determining Linear
Estimates of the Mean and Standard Deviation of the Normal Distribution from Censored
Samples, N < 10, A. E. Sarhan and B. G. Greenberg, University of North Carolina.

2. Relations Between Stochastic Processes, Dr. Bayard Rankin, Massachusetts Institute
of Technology.

3. The Two Sample Multivariate Problem in the Degenerate Case, (Preliminary Report),
A. P. Dempster, Princeton University.

4. A Test for Independence in Contingency Tables, (Preliminary Report), James G. C.
Templeton, Princeton University.

5. Incomplete Block Rank Analysis: $2^{p-q}$ Fractional Factorials Using a Method of Paired
Comparisons, Otto Dykstra, Jr., General Foods Corporation (introduced by Cuthbert
Daniel, New York City).

6. On Mixed Single-sample Tests, Leonard Cohen, Columbia University.

ALLAN BIRNBAUM, 
Associate Secretary 




REPORT OF THE CHICAGO MEETING OF THE INSTITUTE OF 
MATHEMATICAL STATISTICS 


The seventieth meeting, a Central Regional meeting, of the Institute of 
Mathematical Statistics was held at the University of Chicago, Chicago, Illinois, 
on April 27-28, 1956. The meeting was in conjunction with a regional meeting of 
the Biometric Society. 

The following 72 members of the Institute attended: 


Om P. Aggarwal, Virgil L. Anderson, Kenneth J. Arnold, John L. Bagg, R. R. Bahadur, 
Theodore Alfonso Bancroft, Elizabeth K. Banks, Richard E. Beckwith, Joseph Berkson, 
Helen Bozivich, K. A. Brownlee, Edward C. Bryant, D. L. Burkholder, Irving W. Burr, 
Chin Long Chiang, Yun Shick Choi, Willard H. Clatworthy, Donald A. Darling, Allen T. 
Craig, Edwin L. Crow, Herbert T. David, Morris De Groot, Joseph A. Dubay, Charles E. 
Gates, John P. Gilbert, Edwin L. Godfrey, William A. Golomski, Gene H. Golub, Leo A. 
Goodman, Franklin A. Graybill, John Gurland, Robert Hogg, Paul G. Homeyer, W. H. 
Horton, Paul Irick, R. J. Jessen, Leo Katz, Morris Katz, Lloyd A. Knowler, Carl F. Kos- 
sack, Julius Lieblein, Albert Madansky, Margaret P. Martin, Harlley E. McKean, Jack 
Nadler, Monroe L. Norden, H. W. Norton, Junjiro Ogawa, Ingram Olkin, M. Vasudeva Pai, 
John W. Pratt, Paul H. Randolph, Paul R. Rider, W. C. Ross, Jagdish Rustagi, L. J. 
Savage, Esther Seiden, Gordon R. Sherman, Morris Skibinsky, James H. Stapleton, Stanley 
Stavropoulos, Charles R. Sutermaster, Z. Szatrowski, Joseph V. Talacko, Donovan J. 
Thompson, W. A. Thompson, Jr., G. Tintner, Sylvanus A. Tyler, M. E. Turner, Willem 
Vanderbyl, David L. Wallace, W. Allen Wallis. 


The program of the meeting was as follows: 


FRIDAY, APRIL 27, 1956 


8:30 a.m. Informal Round Table Discussion on the Use of Experimental 
Design and Regression Analysis in Industrial Research Problems. 


Chairman: Charles Hicks, Purdue University.
10:30 a.m. Sufficient Statistics in a First Course in Mathematical Statistics. 
Chairman: Leo Katz, Michigan State University. 


Papers: 1. Some Uses in the Theory of Sampling, Allen T. Craig, State University of
Iowa.
2. Certain Extensions and Applications to Statistical Inference, Robert V.
Hogg, State University of Iowa.


1:30 p.m. Joint Session with Biometric Society I. 


Chairman: Thomas Park, University of Chicago.


Paper: 1. Egg-laying—egg-eating Stochastic Processes for Flour Beetles, Chin Long
Chiang, University of California, Berkeley.


2:00 p.m. Joint Session with Biometric Society II. United Nations—F.A.O 
International Statistical Training Centers. 


Chairman: T. A. Bancroft, Iowa State College.


Papers: 1. The New Delhi Center, J. Lush, Iowa State College.
2. The Buenos Aires Center, R. J. Jessen, Iowa State College; Donovan
Thompson, University of Pittsburgh.
3. Plans for the Mexico City Center, P. G. Homeyer, Iowa State College.


3:30 p.m. Contributed Papers. 
Chairman: K. J. Arnold, Michigan State University.


1. A Note on Ranking Means, W. A. Thompson, Jr., Fort Bliss, Texas.

2. Multivariate Ratio Estimation for Finite Populations, Ingram Olkin, University
of Chicago and Michigan State University, (By title).

3. Runs Above the Sample Mean, Herbert T. David, University of Chicago.

4. Approximations to the Power of Rank Tests, Chia Kuei Tsao, Wayne University,
(By title).

5. On Maximizing and Minimizing a Certain Integral with Statistical Applications,
Jagdish S. Rustagi, Carnegie Institute of Technology.

6. Seasonal Forecast of Some Time Series, Joseph V. Talacko, Marquette
University.

7. Quadratic Time Homogeneous Birth and Death Processes, Peter W. M.
John, University of New Mexico, (By title).

8. A Method of Multiple Rank Correlation, Hans Theil, Netherlands School
of Economics and University of Chicago, (introduced by D. L. Wallace).

9. Post Stratification in Multistage Sampling, W. H. Williams, Iowa State
College, (introduced by T. A. Bancroft).

10. Confidence Intervals for Variance Ratios Specifying Genetic Heritability,
F. A. Graybill, F. Martin, and G. Godfrey, Oklahoma Agricultural and
Mechanical College.

11. Moments of a Test Criterion For Outliers, P. A. Keys, Iowa State College,
(introduced by T. A. Bancroft).


SATURDAY, APRIL 28, 1956 


8:30 a.m. Regression and Functional Relations. 


Chairman: W. A. Wallis, University of Chicago.
Papers: 1. Problems Involving the Interrelations of Stochastic Variables, Gerhard
Tintner, Iowa State College.
2. Linear Functional Relations with Variables Subject to Error, John Gurland,
Iowa State College.
3. Some Remarks on Regression, Joseph Berkson, Mayo Clinic.
Discussion: L. J. Savage, University of Chicago.
Hans Theil, Netherlands School of Economics, Rotterdam, and University of
Chicago.

1:30 p.m. Joint Session with Biometric Society III. Applications of Electronic
Digital Computers in Statistical Analysis. 

Chairman: Horace Norton, University of Illinois. 

Papers: 1. Correlation, Regression and Miscellaneous Techniques, G. H. Golub,
University of Illinois.

2. Linear Programming, E. R. Swanson, University of Illinois.

3. Analysis of Variance and Related Techniques, W. C. Jacob, University
of Illinois.


I thank David L. Wallace, Assistant Secretary for the meeting, for providing 
the above information. 
William Kruskal
Associate Secretary 
REPORT OF THE TREASURER FOR 1955

(This statement went to press before completion of the final audit and
is subject to minor changes.)


THE INSTITUTE OF MATHEMATICAL STATISTICS
Statement of Financial Condition
December 31, 1955


ASSETS
Current Assets
    Cash in Bank                                      $ 6,997.22
    Savings Bank Account                                2,040.11
    Investments (at cost)                              52,488.83
    Dues Receivable                                       239.25
    Due on Back Issue Sales                               444.55
    Miscellaneous Accounts Receivable                       9.50
    Inventory of Back Issues                           19,235.62
    Subscriptions Receivable                               92.75

Total Assets                                          $81,547.83
LIABILITIES AND MEMBERS EQUITY
Current Liabilities
    Account Payable, Printing of Dec. Annals          $ 4,764.27
    Withholding and F.I.C.A. taxes payable                119.82
    Accrued Expenses                                       40.93
    Amount held for Biometrika subscriptions               84.50
    National Science Foundation Grant                   1,296.16

Total Current Liabilities                                          $ 6,305.68

Liabilities to Members and Subscribers
    Advances on Dues, 1956, 1957                      $   657.22
    Advances on Dues, Life Members                      2,872.50
    Advances on Subscriptions                           5,825.15

Total Liabilities to Members and Subscribers                         9,354.87

Members Equity
    Reserve for Maintaining Supply of Back Issues     $23,078.52
    Surplus                                            42,808.76

Total Members Equity                                                65,887.28

Total Liabilities and Members Equity                               $81,547.83


THE INSTITUTE OF MATHEMATICAL STATISTICS
Statement of Income and Expenses
for the period January 1, 1955 through December 31, 1955


Revenues
    Membership Dues                                                $14,258.
    Subscriptions                                                   10,599.
    Sale of Back Issues
        Gross Income                                  $ 4,586.57
        Cost of Sales                                   1,972.84     2,613.7
    Income from Investments                                          1,763.65

Total Revenue                                                      $29,234.43


Expenses
    Printing of Current Annals:
        Cost of Current Issues Printed                $14,508.29
        Inventory, December 31, 1955 of 1955 Annals     2,270.20   $12,238
    Editorial Expense                                                  268.
    Miscellaneous Printing, Stationery, Postage                      1,815.62
    Miscellaneous Office Expense                                     1,650.
    Salary                                                           2,784
    Contributions                                                      151.7
    Meeting Expense                                                    218.
    President's Office Expenses                                         44.3
    Travel                                                             206.
    F.I.C.A. tax                                                        54.78

Total Expenses                                                     $19,428.10

Excess of Revenues over Expenses                                     9,806.33

Minus Addition to Reserve for Maintaining Supply of Back Issues      1,326.00

Increase in Surplus                                                  8,480.33

Surplus, December 31, 1954                            $34,362.54
Less adjustment for F.I.C.A. tax                           34.11    34,328.43

Surplus, December 31, 1955                                         $42,808.76
A. H. Bowker,
Treasurer


PUBLICATIONS RECEIVED


Adams, Joe Kennedy, Basic Statistical Concepts, McGraw-Hill Book Company, Inc.,
New York, 1955, xvi + 304 pp., $5.50.

Baker, C., Technical Publications—Their Purpose, Preparation and Production, John 
Wiley and Sons, Inc., New York, 1955, 302 pp., $6.00. 

Binet, F. E., R. T. Leslie, S. Weiner and R. L. Anderson, Analysis of Confounded
Factorial Experiments in Single Replications, N. C. Agr. Exp. Stat. Tech. Bull. No. 113,
Institute of Statistics Reprint Series No. 70, 1955, 64 pp., $1.00.

Bose, R. C., W. H. Clatworthy and S. S. Shrikhande, Tables of Partially Balanced
Designs with Two Associate Classes, N. C. Tech. Bull., Institute of Statistics Reprint
Series No. 50, 1955, $2.00.

Chacón, P. Enrique, Curso de Estadística, Volumen I, Estadística Descriptiva y Modelo
Matemático, Patronato de la Universidad de Deusto, Bilbao, Spain, 1955, xvi + 494
pp., $7.00.

Kelly, Kenneth L. and Deane B. Judd, The ISCC-NBS Method of Designating Colors and
a Dictionary of Color Names, National Bureau of Standards Circular 553, 1955, 158 pp.,
$2.00. (Order from the Government Printing Office, Washington 25, D. C.).

Roy, René, Cahiers du Séminaire d'Économétrie, No. 3—Les Modèles économétriques,
Centre National de la Recherche Scientifique, Paris 7e, France, 1955, 147 pp., Frs.
750.

Sen, A. R., R. L. Anderson and A. L. Finkner, A Comparison of Stratified Two-Stage
Sampling Systems, N. C. Tech. Bull., Institute of Statistics Reprint Series No. 54,
1955, (Reprinted from J. Amer. Stat. Assoc., Vol. 49, No. 267, September, 1954).

Table of the Descending Exponential, x = 2.5 to x = 10, National Bureau of Standards
Applied Mathematics Series 46, 76 pp., 50 cents. (Order from Government Printing Office,
Washington 25, D. C.).

Table of Hyperbolic Sines and Cosines, x = 2 to x = 10, National Bureau of Standards
Applied Mathematics Series 45, 81 pp., 55 cents. (Order from Government Printing
Office, Washington 25, D. C.).


Vol. VI
L. A. Santaló


Procopio Zoroa
Jesús de la Sala


Notas


A. Díaz Ungría, A. Camacho y S. Ríos

CONTENTS Cuad. II


L'état actuel du problème de Behrens-Fisher.
The mean successive difference in samples from an exponential population.


Estudio de un problema de distribución de mineral.
Un diseño factorial aplicado al estudio de la productividad.
Bibliografía, Cuestiones y Ejercicios


CONTENTS Cuad. III


Sobre la distribución de los tamaños de corpúsculos contenidos en un cuerpo a partir de
Simplex al problema del transporte de Hitchcock.
El análisis operacional en la propaganda


Studies in Econometric Method (Hood and Koopmans, eds.) Review by Olav Reiersøl

An International Economic System (J. J. Polak). Review by John S. Chipman

Approximations for Digital Computers (Cecil Hastings, Jr., et al.). Review by Harry Goheen

Short-Term Economic Forecasting (Studies in Income and Wealth Vol. 17) (N.B.E.R.) Review by Ralph Turvey

An Introduction to Stochastic Processes with Special Reference to Methods and Applications (M. S. Bartlett).
Review by J. Wolfowitz

The Theory of Economic Dynamics (M. Kalecki). Review by John S. Chipman

Post-Keynesian Economics (K. K. Kurihara, ed.). Review by Kazuo Midutani

Published Quarterly Subscription rates available on request
The Econometric Society is an international society for the advancement of economic theory in its
relation to statistics and mathematics.